If quantum mechanics hasn’t profoundly shocked you, you
haven’t understood it yet.
Niels Bohr 


Whether we like it or not, modern ways are going to alter 
and in part destroy traditional customs and values. 


Werner Heisenberg, 
Physics and Philosophy: 
The Revolution in Modern Science 


Contents

Table of Contents
A preface of prefaces
Introduction
    Nature is quantized
    Physics, mathematics and concepts

I The journey: from classical to quantum worlds

I.1 The gems of classical physics
    Mission almost completed
    Newtonian mechanics and gravity
    Four laws only
    Dynamical systems
    Conservation laws
    Classical mechanics for aficionados
    * The shortest path *
    Maxwell's electromagnetism
    The Maxwell equations
    Electromagnetic waves
    Lorentz invariance: the key to relativity
    Gauge invariance: beauty and redundance
    Monopoles: Nature’s missed opportunity?
    Statistical Physics: from micro to macro physics
    Thermodynamics: the three laws
    Understanding entropy
    * Two cultures *
    Statistical mechanics
    Statistical thermodynamics
    The ideal gas

I.2 The age of geometry, information and quantum
    Canaries in a coal mine
    The physics of space-time
    Special relativity
    General relativity
    Big Bang cosmology
    Cosmic inflation
    * Much ado about nothing *
    The physics of geometry
    Curved spaces (manifolds) and topology
    The geometry of gauge invariance
    The physics of information: from bits to qubits
    Information and entropy
    Models of computation
    Going quantum
    Quantum physics: the laws of matter

I.3 Universal constants, scales and units
    Is man the measure of all things?
    On time
    Reinventing the meter
    * When the saints go marching in... *
    The virtue of heuristics
    Going quantum
    Natural units ©1898 Max Planck
    Black holes
    Black hole thermodynamics
    Accelerated observers and the Unruh effect
    The magic cube

I.4 The quest for basic building blocks
    A splendid race to the bottom
    Fatal attraction: forces yield structure
    Atomic structure
    The Bohr atom: energy quantization
    The Schrödinger atom: three numbers
    The discovery of spin
    * Behind the scenes *
    Fermions and bosons
    Atoms: the building blocks of chemistry
    Nuclear structure
    Isotopes and nuclear decay modes
    Positron-emission tomography (PET)
    Transmutation: Fission and fusion
    * Chrysopoeia? *
    ITER: the nuclear fusion reactor
    Field theory: particle species and forces
    The Dirac equation: matter and anti-matter
    Quantum Electrodynamics: QED
    Subnuclear structure
    The Standard Model
    Flavors, colors and families
    The strong interactions
    The electro-weak interactions
    A brief history of unification
    Supersymmetry
    Superstrings
    Strings: all fields in one?
    M-theory, D-branes and dualities
    Holography and the AdS/CFT program
    At home in the quantum world

Indices
    Subject index Volume I
    Name index Volume I

II Quantessence: how quantum theory works

Contents

II.1 The quantum formalism: states
    Quantum states: vectors in Hilbert space
    * Reader alert *
    Quantum versus classical
    The correspondence principle
    Classical states: phase space
    The mechanics of a bit
    Quantum states: Hilbert space
    States of a quantum bit
    The scalar or dot product
    A frame or basis
    The linear superposition principle
    * Ultimate simplicity *
    Ultimate simplicity: a single state system?
    Qubit realizations
    Entanglement
    Multi-qubit states
    Entangled states
    Schrödinger’s cat
    Entangled vs separable states
    From separable to entangled and back
    Mixed versus pure states
    The density operator
    Quantum entropy
    Entanglement entropy
    * Boizilla *
    Decoherence


II.2 Observables, measurements and uncertainty
    Quantum observables are operators
    Sample spaces and preferred states
    * Barbies on a globe *
    Spin or qubit Hamiltonians
    Frames and observables
    Unitary transformations
    Photon gates and wave plates
    Incompatible observables
    Projection operators
    Raising and lowering operators
    Quantum measurement
    * Leaving a trace *
    No cloning!
    The probabilistic outcome of measurements
    The projection postulate
    Quantum grammar: Logic and Syntax
    * wavefunction collapse *
    The case of a classical particle
    The case of a quantum particle
    The case of a quantum bit
    Certain uncertainties
    The Heisenberg uncertainty principle
    A sound analogy
    Heisenberg’s derivation
    Qubit uncertainties
    * Vacuum energy *
    The breakdown of classical determinism
    Why does classical physics exist anyway?

II.3 Interference
    Classical wave theory and optics
    Basics of wave theory
    Reflection, transmission, etc.
    Beamsplitters and polarization
    Photon polarization: optical beamsplitters
    Spin polarization: the Stern-Gerlach device
    * A Barbie’s choice *
    Interference: double slit experiments
    A basic interference experiment
    A delayed choice experiment
    The Aharonov-Bohm phase
    The Berry phase
    Spin coupled to an external magnetic field
    Probing the geometry of state space
    The Berry connection
    Quantum tunnelling: magic moves

II.4 Teleportation and computation
    Entanglement and teleportation
    The Einstein–Podolsky–Rosen paradox
    The Bell inequalities
    Hidden no more
    A decisive three photon experiment
    Quantum teleportation
    * Superposition *
    Quantum computation
    Quantum gates and circuits
    Shor’s algorithm
    Applications and perspectives

II.5 Particles, fields and statistics
    Particle states and wavefunctions
    Particle-wave duality
    The space of particle states
    A particle on a circle
    Position and momentum operators
    Energy generates time evolution
    Wave mechanics: the Schrödinger equation
    Matrix mechanics: the Heisenberg equation
    Classical lookalikes
    The harmonic oscillator
    Coherent states
    Fields: particle species
    * The other currency *
    Particle spin and statistics
    Indistinguishability
    Exclusion
    The topology of particle exchange
    The spin-statistics connection
    Statistics: state counting
    More for less: two-dimensional exotics

II.6 Symmetries and their breaking
    Symmetries of what?
    Symmetries and conserved quantities
    The full symmetry of the hydrogen atom
    Symmetry algebra and symmetry group
    Gauge symmetries
    Non-abelian gauge theories
    The Yang-Mills equations
    The symmetry breaking paradigm
    The Brout–Englert–Higgs (BEH) mechanism
    Symmetry concepts and terminology

Indices
    Subject index Volume II
    Name index Volume II


III Hierarchies: the emergence of diversity

Contents

III.1 The structural hierarchy of matter
    Collective behavior and the emergence of complexity
    The ascent of matter
    Molecular binding
    The miraculous manifestations of carbon
    Nano physics
    The molecules of life

III.2 The splendid diversity of condensed matter
    Condensed states of matter
    Order versus disorder
    Magnetic order
    The Ising model
    * Swing states *
    Crystal lattices
    Crystallization and symmetry breaking
    Liquid crystals
    Quasicrystals

III.3 The electron collective
    Bands and gaps
    Electron states in periodic potentials
    Semiconductors
    Superconductivity
    The quantum Hall effect
    Topological order

III.4 Scale dependence
    Scaling in geometry
    Self similarity and fractals
    The disc where Escher and Poincaré met
    Scaling in dynamical systems
    The logistic map
    Scaling in quantum theory
    Quantum mechanics
    Quantum field theory
    The Euclidean path integral
    Scaling and renormalization
    * The quantum bank *
    Running coupling constants
    Mechanical analogues
    Gauge couplings
    Grand unification: where strong joins weak
    Phase transitions
    On the calculation of quantum corrections
    Perturbation theory
    Quantum fluctuations in QED
    A realistic example: Vacuum polarization
    The cut-off and the subtraction point

III.5 Power of the invisible
    Summary and outlook
    The quantessence in retrospect
    Three volumes
    Three layers
    Common denominators
    Scenarios for past and future
    The double helix of science and technology
    Trees of knowledge

A Math Excursions
    On functions, derivatives and integrals
    On algebras
    On vectors and matrices
    On vector calculus
    On probability and statistics
    On complex numbers
    On complex vectors and matrices
    On symmetry groups

B Chronologies, ideas and people

Indices
    Subject index Volume III
    Name index Volume III

List of Figures
List of Tables

Acknowledgements

About the author

A preface of prefaces 


We all agree that your theory is crazy. The question 
which divides us is whether it is crazy enough to 
have a chance of being correct. 

Niels Bohr (addressing Wolfgang Pauli) 


The title Power of the invisible could cover a lot of pos- 
sible subjects, ranging from ordinary gossip to the most 
elevated of spiritual teachings, as well as from the Earth’s 
magnetic field to the invisible microcosmos. It underscores 
the plain fact that most things are actually invisible, unseen 
by the naked eye. The subtitle of this trilogy, The quantessence
of reality, makes clear that in this book we limit ourselves
to a world that is inaccessible to the human eye in
a physical sense: A world that was only made visible hun- 
dreds of thousands of years after human history started 
through the development of science and technology. Hu- 
mans have always been aware of the sky and the heav- 
ens, but only relatively late did they realize that there was 
a universe as vast, diverse and mysterious on the inside of 
things. The title mainly refers to the power of that hidden 
microcosmos, and the tremendous forces that are at work 
within it. 


The word quantessence is a neologism which means ‘the
quintessence of quantum’, referring to phenomena that can
only be explained in terms of quantum theory. A theory
is a model, a symbolic representation of (a part of) the
world that supposedly explains in a logically coherent way
how that part works. In that sense it is a visualization, an
abstract reconstruction of that invisible microcosmos in terms
of mathematical symbols and equations. And this is what 
most scientific explanations in the end tend to boil down to. 
And it is also this underlying network of relations and fun- 
damental principles which govern reality that represents 
the power of the invisible. 


The path towards such a model has been provided by
an incredible interplay between science and technology,
where ever more refined instruments were conceived and 
constructed to make discernible what was invisible before. 
In this way humankind has for millennia managed to push 
the boundaries of what is observable forward in an objec- 
tive sense. And that process has fundamentally changed 
the nature of human existence. That is how we became 
aware of the tremendous power of the invisible and the 
quantessence of reality. The beautiful phrase ‘Humans
became aware, or learned about, or understood’ covers up
the sobering fact that the lucky humans who are referred
to unfortunately form a tiny fraction of humankind: a nerdy
caste of scientists, as high priests of scientific knowledge.
They are a tiny fraction in spite of the fact that everybody is 
invited to come and share their collective wisdom by read- 
ing books or engaging otherwise. And that turns out to be 
not so easy. 


Scientific textbooks take pride in being as impersonal as a 
brick. It provides them with an aura of objectivity. Question:
what do Bethe, Baym, Bohm, Davies, Dirac, Feynman,
Greiner, Griffiths, Gottfried, Kemble, Kramers, Landau,
Leblond, Levy, Lipkin, Mandl, Martin, Matthews, Merzbacher,
Messiah, Mott, Omnès, Pauling, Schiff, Sakurai,
Shankar, Tannoudji and Weyl have in common? Indeed,
each of them has (co-)authored a textbook or two on quan- 
tum mechanics. Let me tell you how this works. If you have 
to teach a course on quantum theory, you can choose from 
more than fifty textbooks: an impressive oeuvre that bears 
witness to a profound love for our deepest knowledge. It 
doesn’t stop many a teacher from adding their own little 
masterpiece to it. For students it is often a great relief to 
discover that the overlap between these books is so im- 
mense, that complete bookcases in the library effectively 
shrink to a tiny pile of classics. If you’ve read one, you’ve 
read many. 


The personal view of the author usually becomes clear in 
their limited choice of subjects, and if everything is well, 
they should apologize for that in the Preface. That by itself
is not so exciting, in spite of being universal. Sometimes
however — and that is what concerns us here — the Pref- 
ace has far more to say. It appears to be the only place 
where the author is allowed to make their personal views 
known, and indeed I must admit that those are harder to
embed in a treatment of, say, angular momentum. In the 
preface the author may bare their soul. It may articulate 
the zeitgeist and even deteriorate into a manifesto of prin- 
ciples. The innocent looking preface may actually just be a 
hidden persuader for personal prejudices: a mission state- 
ment, which might amount to little more than the scientific 
equivalent of what politicians call corridor talk. Actually, it 
is a place where scientists publicly tell each other the truth.
Therefore this ‘preface of prefaces’ is a virtual quantum 
dialog between some of the masters which is concocted 
from their outspoken prefaces. This is a small quantum 
correction to the immaculate status of some of our quan- 
tum classics. 


In 1924 the first version appeared of the standard work
Methoden der mathematischen Physik by Courant and Hilbert
(this book evolved into the monumental work in two
parts that was printed in 1938). It appeared in the German
university city of Göttingen at the time when the mental
landslide of quantum mechanics took place. As
a matter of fact the books covered classical mathematical
physics but treated the subject of differential equations,
and in particular eigenvalue problems, in great detail; these
then played a central role in solving, for example, the
Schrödinger equation. After Courant fled Germany, long before
the Second World War, the Nazis blocked distribution of
the book (as you may read in the preface to the 1953 edition).
Let me share a somber quote from the original 1924
version:


So kommt es dass viele Vertreter der Analysis das Bewusstsein
der Zusammengehörigkeit ihrer Wissenschaft
mit der Physik und anderen Gebieten verloren haben,
während auf der anderen Seite oft den Physikern das
Verständnis für die Probleme und Methoden der Mathematiker,
ja sogar für deren ganze Interessensphäre und
Sprache abhanden gekommen ist. Ohne Zweifel liegt
in dieser Tendenz eine Bedrohung für die ganze Wissenschaft
überhaupt; der Strom der wissenschaftlichen
Entwicklung ist in Gefahr, sich weiter und weiter zu verästeln,
zu versickern und auszutrocknen.¹


Courant therefore had no lack of drive to write a beautiful 
book. Another early classic (but in many ways modern) 
about quantum theory is The Principles of Quantum Me- 
chanics by Paul Dirac (first edition in 1930). He was well- 
known to be a man of few words: 


Mathematics is the tool especially suited for dealing with 
abstract concepts of any kind and there is no limit to its 
power in this field. For this reason a book on the new 
physics, if not purely descriptive of experimental work, 
must be essentially mathematical. 


The book then continues to present quantum theory in a 
form that he referred to as the ‘symbolic method’, a method 
used all over the place today: 


...I have chosen the symbolic method, introducing the
representatives later merely as an aid to practical calculation.
This has necessitated a complete break from
the historical line of development, but this break is an
advantage through enabling the approach to the new
ideas to be made as direct as possible.

¹ As a result, many practitioners of mathematical analysis have lost
the awareness of their science’s affiliation with physics and other fields,
while on the other hand, physicists often have lost the understanding
of the problems and methods of mathematicians, and indeed of their
whole sphere of interest and language. There is no doubt that this
trend poses a threat to the whole of science; the stream of scientific
development is in danger of branching out more and more, of
seeping away and of drying up.


The physicists who were of the opinion that Dirac’s ap- 
proach was too mathematical were silenced by the quite 
outspoken preface of the Mathematische Grundlagen der 
Quantenmechanik by John von Neumann (1932). The opening
line makes it unambiguously clear what the goals are
and what standards are to be maintained throughout:²


Der Gegenstand dieses Buches ist die einheitliche, und, 
soweit als möglich und angebracht, mathematisch einwandfreie
Darstellung der neuen Quantenmechanik. ...


And later on he even pays a compliment:


Eine an Kürze und Eleganz kaum zu überbietende Darstellung
der Quantenmechanik, die ebenfalls von invariantem
Charakter ist, hat Dirac in mehreren Abhandlungen sowie
in seinem kürzlich erschienenen Buche gegeben.³


that turns out to be a prelude to a less generous passage: 


Die erwähnte, infolge ihrer Durchsichtigkeit und Eleganz
heute in einen grossen Teil der quantenmechanischen
Literatur übergegangene Methodik von Dirac wird den
Anforderungen der mathematischen Strenge in keiner Weise
gerecht – auch dann nicht, wenn diese natürlicher-
und billigerweise auf das sonst in der theoretischen Physik
übliche Mass reduziert werden.⁴

² The subject of this book is the unified and, as far as possible and
appropriate, mathematically rigorous presentation of the
new quantum mechanics.

³ An account of quantum mechanics, which can hardly be surpassed
in brevity and elegance, and which is also of an invariant character,
has been given by Dirac in several papers as well as in his recently
published book.


Kramers in his Quantum Mechanics from 1937 holds a
view rather orthogonal to von Neumann’s, where he returns
to the more heuristic, physically oriented approach of Bohr:


The apparent lack of mathematical morals which is con- 
tritely pointed out repeatedly in the text is not exclusively 
due to the incompetence of the author. Physical morals, 
even (or rather especially) in their purest form, that is, 
unencumbered by pedagogic afterthoughts, do not live 
happily together with their mathematical relations in the 
restricted mansion of the human mind — and neither in 
the restricted volume of a monograph. 


The famous Russian physicists Landau and Lifschitz set 
their own magnificent standard in their course on Theo- 
retical Physics, which consists of more than ten volumes. 
These are the books from which our Russian colleagues 
loved to recite. If you got into a heavy-duty technical argu- 
ment with them, they would exclaim: ‘But don’t you know 
this? Is well-known exercise in the chapter five, of the vol- 
ume eight of the Landau Lifschitz!’ Little less than the So- 
viet equivalent of a bible, it managed quite well to spread 
its profound physics wisdom. The first edition dates back to 
1947. In the preface to volume three, Quantenmechanik,⁵
the authors note the following:


Man kann nicht umhin festzustellen, dass die Darstellung
in vielen Lehrbüchern der Quantenmechanik komplizierter
als in den Originalarbeiten ist. Obwohl eine
solche Darstellung gewöhnlich mit grösserer Allgemeinheit
und Strenge begründet wird, ist jedoch bei aufmerksamer
Betrachtung leicht zu erkennen, dass sowohl das
eine wie die andere tatsächlich oft illusorisch sind, was
sogar soweit geht, dass sich ein beträchtlicher Teil der
‘strengen’ Sätze als fehlerhaft erweist. Da uns eine
solche komplizierte Darstellung völlig ungerechtfertigt
erscheint, haben wir uns umgekehrt um denkbar mögliche
Einfachheit bemüht und haben vielfach auf die Originalarbeiten
zurückgegriffen.⁶

⁴ The methodology of Dirac mentioned above, which, owing to its
transparency and elegance, has today been carried over to a large part
of the quantum mechanical literature, in no way does justice to the
requirements of mathematical rigor, even if the standard is lowered to the
more natural and reasonable one typical for theoretical physics.

⁵ I apologize for quoting the German version which was for sale for a
dollar or less in the former Soviet Union, at least on the rare occasions
that it was not sold out. No easy reading because the formulas were
set in Fraktur, the old German alphabet.


David Bohm also regrets in the preface to his Quantum 
Theory from 1951 the loss of qualitative, imaginable phys- 
ical concepts. Bohm was well aware of the subtleties and 
essential role of the measurement process in quantum me- 
chanics. And it should be said that the whole arsenal 
of rather puzzling, if not controversial, Gedanken Exper- 
imente which have in the meantime descended into the 
blood, sweat and tears in the lab, form a vindication of his 
cry to further elucidate the fundamental concepts underly- 
ing the theory: 


So strong is this contrast [between classical and quantum
physics] that an appreciable number of physicists
were led to the conclusion that the quantum properties
of matter imply a renunciation of the possibility of their
being understood in the customary imaginative sense,
and that instead, there remains only a self-consistent
mathematical formalism which can, in some mysterious
way, predict the numerical results of actual experiments.
Nevertheless, ..., it finally became possible to express
the results of the quantum theory in terms of comparatively
qualitative and imaginative concepts, which are,
however of a totally different nature from those appearing
in the classical theory.

⁶ One cannot help but notice that the presentation in many textbooks
of quantum mechanics is more complicated than in the original works.
Although such a presentation is usually justified by greater generality and
rigor, it is easy to see, after careful consideration, that both are often
illusory, and this even goes so far that a considerable part
of the ‘rigorous’ statements prove to be faulty. As such
a complicated presentation appears to us to be completely unjustified, we
have, conversely, tried to stay as simple as possible and have often
resorted to the original works.


In this anthology we have to include the celebrated Feynman
Lectures, as they form a most original and inspiring
treatment of the theoretical basis of the physics curriculum.⁷
To my knowledge it is also the first book written in
the first person, reflecting his outspoken aversion to formality
and distance. Therefore in his Lectures you will regularly
find statements that are unmistakably Feynman-like
(from Part III, Chapter 1: Quantum behavior):


This would mean, if it were true, that physics has given 
up on the problem of trying to predict exactly what will 
happen in a definite circumstance. Yes! Physics has 
given up. 


In the preface the legendary teacher shows himself ac- 
countable for his pedagogical adventures (no need for the 
evaluation jungle that tends to stifle modern educational 
institutions): 


The question, of course, is how well this experiment
succeeded. My own point of view – which, however,
does not seem to be shared by most of the people who
worked with the students – is pessimistic. I don’t think
I did well by the students. When I look at the way the
majority of the students handled the problems on the
examinations, I think that the system is a failure. ... But
then, ‘The power of instruction is seldom of much efficacy
except in those happy circumstances where it is
almost superfluous.’ (Gibbon)

⁷ The quite accessible first chapter of his book with Hibbs,
Quantum Mechanics and Path Integrals, and his popular booklet
QED are also a must.


There are more recent attempts to pick up the innovative 
approach in the presentation of quantum mechanics, for 
example in the book Quantics of Lévy-Leblond and Bal- 
ibar. The term ‘quantique’ is apparently slang for ‘quantum 
mechanics’ used by French students. The English ver- 
sion ‘quantics’ has not seen a similar popularity among the 
youth educated in English, and if it is used, it is rather
in the world of data analysis and consultancy. There is a
species of whizzkids called ‘quants’, who make money in 
investment banking. No quantum theory required. Yet. 


Nobody really dares to base an entire course in the spirit 
of these textbooks [the Feynman and Berkeley series], 
and often they are only used to breathe an extra bit of 
spirit (in some physical sense, let us say) into the tra- 
ditional abstract and scholastic way of teaching. The 
teaching method of Feynman and Wichman is not, after 
all, taken seriously. 


Further on in the preface we read: 


One often hears research workers expressing the de- 
sire to widen their professional culture, to deepen or re- 
juvenate their primary education. Such an aspiration 
does not come from an abstract desire to become gen- 
erally cultured. Rather, it reflects the desire to increase 
their ability to picture, interpret and understand physics 
— their physics. To satisfy this need, these researchers 
all too often have at their disposal daunting and sophisticated
treatises, which they find intimidating, since they
have the impression that they would only find abstract
answers to their concrete questions.


It was this exploration of prefaces that provided me with 
one of the principal motivations for writing this book. In
theoretical physics and quantum theory in particular, there is
always a tension between mathematical rigor and physical 
understanding, between formal arguments and intuition, 
between abstract representations and physical reality. If 
we look back at the development of quantum theory, we 
see from observational evidence that classical physics was 
failing us; we had to first develop a mathematical frame- 
work for the quantum world. The physical intuitions, of 
which the physicists were so proud, were so deeply rooted 
in the classical experience that they led them completely 
astray in the quantum world and made the development of 
a suitable theory very hard. 


Today however, we are armed with the outcomes of a broad 
spectrum of real lab experiments that in the early quantum
days could only be dreamt of as far-out gedanken experiments.
There is a wide variety of quantum phenomena we
have in the meantime become so ‘familiar’ with, that prac- 
titioners have developed a sort of quantum intuition — in 
the sense of adaptation, being a healthy blend of experience
and common sense. And, with that, a ‘quantum heuristics’ 
came into being — where whatever was considered eso- 
teric speculation before, kind of turned into a bunch of ‘no 
brainers’. This ‘quantum heuristics’ has at least informally 
gained some respectability and legitimacy. It is not quite so 
visible in textbooks but it is certainly predominantly present 
when physicists argue in front of their blackboards. I
expect that this perspective will percolate through in future
quantum books. One might object that this may introduce 
even more quantum vagueness in our quantum conversa- 
tions. Apparently quantum uncertainties have made it all 
the way up to the heart of our ontology and epistemology, 
a remarkable recursion indeed.


This being said, you now know where I found the courage
to produce yet another semi-popular book on quantum physics
and information. You need no longer ask: ‘Who ordered
that?’.⁸


This book aims to demonstrate the ‘Power of the invisible’,
where that power refers to the ‘essence’ or better the
‘quintessence’ of quantum. This assumes that we, after
more than a century of study, do know what the essence
of quantum is. What we know for sure is that it is extremely
powerful, in spite of being to a large extent concerned with
the ‘invisible’.


Talking about the essence of something requires a certain 
depth, not just conveying facts, but creating the appropri- 
ate reference frames and language. This quantessential 
perspective will be presented in the following Introductory 
chapter which also provides a roadmap to this book. 


Complementary reading: 


The Quantum Physicists
W.H. Cropper
Oxford University Press (1970)


⁸ This is what Nobel laureate Isidor Rabi quipped in the mid-thirties,
when informed about the discovery of the muon, a heavy
brother of the electron that at that time seemed to have no purpose,
and no reason to exist.


Further reading. 
Some of the classics mentioned in this chapter: 


Methods of Mathematical Physics 
D. Hilbert and R. Courant 
Wiley-VCH; 2 Volumes (1989) 


The Principles of Quantum Mechanics 
P.A.M. Dirac 
Oxford University Press; 4th edition (1961) 


Mathematical Foundations of Quantum Mechanics
J. von Neumann
Princeton University Press; New edition (2018)


Quantum Mechanics 
H.A. Kramers 
Dover Publications (1964) 


Quantum Mechanics (Non-Relativistic Theory) 
L.D. Landau, E.M. Lifshitz 
Pergamon Press; 3rd edition (1981) 


Quantum Theory 
D. Bohm 
Dover Publications Inc (1989) 


The Feynman Lectures on Physics
R.P. Feynman, R.B. Leighton, M. Sands
Pearson P T R; (3 Volume Set) 1st edition (1970)


Quantics: Rudiments of Quantum Physics 
J-M. Lévy-Leblond, F. Balibar
North Holland (1990) 


Introduction 


When it comes to atoms, language can be used 
only as in poetry. The poet, too, is not nearly so 
concerned with describing facts as with creating 
images. 

Niels Bohr, ‘Atomic Physics and the Description of 
Nature’ (1934) 


In this introduction we show how the book is structured and 
give some advice on how to read it. 


Mathematics as a language of Nature. Quantum theory
is known to be a difficult subject and becomes completely
unfathomable if you have to rely entirely on our feeble natural
language to describe it. Therefore I hope that you will
not be scared away by the book’s rather mathematical appearance,
particularly the second volume which looks as if
it is full of equations. Don’t put the book aside just because
of its intimidating appearance. Natural language is not the
optimal means precisely because in quantum theory we
enter realms of reality that are quite remote from our everyday
experiences and preconceptions. Our cherished
‘common sense’ appeared to be of limited use and easily
led us astray. Some call the quantum world mysterious
or alien, while others see it as elusive or unfathomable;
indeed one may easily get drowned if the message is communicated
to you in words only.


Mathematics is here to rescue us; it allows us to construct
smart and elegant notions that perfectly fit nature’s needs,
and it comes with a beautifully efficient notation. The lengthy
descriptions one would need in natural language to convey
the essentials of quantum reality would too easily clutter
the mind and lead to the utmost confusion, as I have
seen happening in quite a few ‘no formula’ expositions of
the quantum world to the layperson. So there are ample
reasons to be courageous and go ‘symbolic.’


The quantum leap. This art work called ‘The running knot’ is
located in the city park of Kanazawa, Japan. (Source: Eryn Vorn,
FLICKR)


Great narratives choose their own language. The heart
of music is in the sound and a verbal substitute would not
do. And as we all know, it takes guidance to learn how
to hear what it expresses. The same is true for the visual
arts. It is hard to imagine a book about Picasso without
pictures. And this is what Sagredo in the Dialogos of
Galileo confided: ‘If I were again beginning my studies,
I would follow the advice of Plato and start with mathematics.’
Yes, the narrative of Nature expresses itself most
eloquently in mathematics. So, we take Sagredo’s advice
to heart and will gently introduce some of the quantessential
mathematical concepts along the way, but always in a
rather pedestrian way⁹. Math, as a language of nature, as
a means for understanding, but not as a purpose on its
own. To that end I have included several so-called Math
Excursions at the end of the third volume. These excursions
explain in a user-friendly way what the math in the
main text is about and will tell you all you need (but maybe
never wanted) to know about matters like functions, complex
numbers, matrices, algebras or vectors. Checking
out these excursions will help you to get more out of this
book.


The best part of climbing a mountain is often the splendid
view from the top. In a similar way we work our way up
to some of the quantessential equations, not in praise of
rigor, but in praise of clarity and beauty. I tell my students
that equations love people, and they had better, because they
owe their existence to them. Bearing that in mind, isn’t
it amazing that this man-made language of mathematics
turns out to be the most ‘natural’ after all? This fascinating
fact inspired the famous mathematical physicist Eugene
Wigner to write an interesting essay about this paradox titled:
‘The unreasonable effectiveness of mathematics in
the natural sciences.’ And as I intend to remain your traveling
companion all along the winding road to the quantum
world, I hope that you will be patient with some of the math
that we will encounter along the way. Think of it as the poetry
of reality: a sublime shorthand endowed with a built-in
integrity. A minimal yet powerful representation of reality.
There is some truth in what John von Neumann, as
keynote speaker at the first national meeting of the Association
for Computing Machinery in 1947, quipped: ‘If
people do not believe that mathematics is simple, it is only
because they do not realize how complicated life is.’


⁹ As we will indeed encounter many of the fundamental equations of
physics along the way, the interested reader who is not at all versed in
these equations may want to look at my popular science book entitled
The Equations: icons of knowledge (Harvard University Press, 2005).


Figure 1: Adinkra symbol. Adinkras are symbols of the people
of the Ashanti Kingdom in West Africa (Ghana) that represent
concepts or wise sayings (aphorisms). This adinkra is called
‘nea onnim no sua a, ohu’, which translates as ‘he/she who does
not know can become knowledgeable through learning.’ I happen
to see many interlocked copies of the letter ‘E’, from Education,
a striking coincidence!

S. James Gates, Complex ideas, complex shapes (2012)


To whom am I talking? One of the first questions a potential
publisher will throw at you as potential author is about
who your perceived audience is. Who is going to read (or
rather, buy) this book? So many pages, so many topics,
so many equations, who the hell do you think... If you cannot
kill your darlings they will kill you! My answer is encrypted
in the symbolic aphorism depicted in Figure 1 saying:
‘he/she who does not know can become knowledgeable
through learning’. Keeping in mind that this holds true
for basically everybody, it stands for the notion of education
permanente, which advocates a broader spectrum of conceivable
audiences for books on knowledge. There is the
questionable dichotomy that books about science have, for
some reason, to belong to either the category ‘popular’ or
‘textbook’, with basically nothing in between. From my experience
of teaching science to all kinds of people, I know
that there are many diverse audiences between those of
laypeople and Harvard graduates. And these present us
with a need for books that try to bridge the intellectual
pseudo gap I just mentioned. Even with the availability of
internet sources like Wikipedia and YouTube there is still a
clear need for in-between books that give a broad coherent
account with some theoretical depth. My hope is that
this book provides an example thereof. So who are the
would-be members of this perceived audience? Students of
various backgrounds and disciplines, from motivated high
school whizzkids to multidisciplinary college students, as
well as their teachers. I think of students in the disciplines
neighbouring physics in the natural sciences, as well as
engineering, mathematics and information science. I think
of journalists and of the growing group of seniors who finally
have time to get to grips with some of the deep scientific
subjects that over the last century, through technological
developments, have so radically transformed the world
around them. I dedicate this work to the bright young people
throughout the world who share that insatiable hunger
for true knowledge and I hope that it will inspire their honorable
quest. Students tend to be overwhelmed by the
‘how to’ questions, which means that the ‘why’ and ‘what
does it mean’ questions are neglected. Let me close this
little pitch with a quote from the early Muslim polymath Al
Kindi¹⁰, who lived around 850 AD:


We should not be ashamed of recognizing truth 
and assimilating it from whatever source it may reach 
us, even though it might come from earlier gener- 
ations or foreign peoples. For him who seeks truth 
there is nothing of more value than truth itself. It 
never cheapens or abases him who searches for 
it, but ennobles and honors him. 


¹⁰ Al Kindi wrote more than 250 books. His Manuscript on Deciphering
Cryptographic Messages, in which he laid the foundation of
cryptanalysis using statistical inference and frequency analysis, is
remarkable.


Nature is quantized 


Quantum theory is not a theory of one particular system 
like the atom; it is a set of universal principles that applies 
to all of nature. 

We present an overview of how this elaborate field is structured
as a whole and thereby motivate the layout of the
book. 


Quantum theory is based on a set of fundamental principles
that nature appears to obey at basically all scales and
therefore underlies all of physics, and more indirectly also
all of chemistry and biology. The dictum is ‘One Nature,
One Science’. Deep down all physical theories have to behave
according to the quantum rules and therefore all our
theories have to be ‘quantized’, somewhat like kids have to
be potty-trained, and dogs have to go to obedience school
to learn not to bark. The quantum postulates forced us
to reinvent the whole of fundamental physics from a new
conceptual basis. We have quite successfully quantized
particles and mechanics, electrodynamics including optics,
and liquids, solids and other condensed forms of matter.
But also unified theories describing subnuclear physics
have been successfully quantized and led to the celebrated
Standard Model. And finally, not so long ago, we
realized that even information should be quantized. This
ongoing quantization process has led to a much deeper
understanding of the fundamental structure of nature, but
also to a huge number of breathtaking applications and
quantum technologies that have only just started to take
off. Indeed, technologies involving quantum information
processing are expected to generate a highly disruptive
transition with a huge socio-economic impact. Yet, this
having been said, there are still many fundamental challenges,
like the quantum interpretation of gravity, the oldest
known force, which must be tackled in order to
understand the origins of the universe or how black holes
work.


Figure 2: Three volumes. Quantum theory was introduced to physics at the atomic level. From there it started spreading into the
other levels of physics, at both larger and smaller scales.


Three volumes. Quantum theory basically originated at
the level of the atom, and by modern standards that is 
an intermediate length scale. From there the applications 
of the basic theory developed in two opposite directions, 
as indicated in the left column of Figure 2. On the one 
hand to ever smaller distances, all the way down to mod- 
ern particle physics, and on the other hand to ever larger 
distances, moving up all the way to modern (bio)chemistry 
and condensed matter physics. The arrows pointing up- 
ward underscore the basic fact that quantum effects are 
by no means restricted to the microscopic domain. There 
are many research fields devoted to the study of quantum 
principles on macroscopic scales, which amounts to ap- 
plying quantum theory to collective phenomena. In that 
sense every cell phone is full of quantum. 


Even though I will restrict myself to the ‘quantessentials’,
the subject is so vast that the book is divided into three 
parts, i.e. volumes, which — as also indicated in Figure 2 — 
can be characterized as follows. 


The first volume of the book, called The journey: from clas- 
sical to quantum worlds, starts with the highlights of clas- 
sical physics and informatics after which it descends into 
the quantum world. It is the narrative guided by man’s pas- 
sionate quest for the most basic building blocks of nature 
and their interactions. We start with marbles and end up 
with quarks and even superstrings. 


In the second volume of the book, called Quantessence: 
how quantum theory works, we delve deeper into the struc- 
ture of the theory and present some of its mathematical 
representations. And we will talk about the conceptual is- 
sues concerning quantum states, observables and mea- 
surements that we encounter along the way. There we will 
be concerned extensively with mind-boggling notions like 
entanglement, particle interference and quantum telepor- 
tation. 


In the third and final volume called Hierarchies: the emergence
of diversity, we discuss quantum theory as it applies
to the structural hierarchy of matter from the atomic
level to chemistry and the quantum physics of condensed
states of matter. We not only consider the hierarchy in a
spatial sense but also how that hierarchy arose in a temporal
sense during the early stages of cosmic evolution. It
closes with a chapter on scaling, discussing notions such
as self-similarity, scale invariance and renormalization,
in order to understand the asymptotic behavior of
theories, viewed as models of nature, at ever smaller or
larger scales. We conclude this quantum
trilogy with a final chapter offering a more general
science-driven perspective.


Physics, mathematics and concepts 


If you look long enough, anything becomes abstract 
Diane Arbus 


This section presents a meta-perspective on how to read 
this quantessential book. The quantum world can be tra- 
versed in many ways, all pertaining to a certain ‘logic’. Tak- 
ing a single path will enlighten certain aspects but may ob- 
scure others. Therefore it is better to combine different 
paths to get an optimal feeling for the quantum landscape. 
To get to the quantessence, one would have to add up the 
contributions of all the different paths¹¹.


Once a field of science (like physics) has matured suf- 
ficiently, one can learn something interesting about the 
structure of scientific knowledge in general. This is indi- 
cated in the layered structure of quantum knowledge in the 
scheme of Figure 3, in which the three columns refer to the 
three layers of knowledge that I like to distinguish between
and which will be explained shortly. 


¹¹ In a symbolic, if not ironic, sense, you could call this a ‘path
integral’ approach to the understanding of quantum theory.


Figure 3: Three layers. In quantum physics one may distinguish three layers displayed here as columns. From left to right, (A) is
about the phenomenology of systems in which quantum theory manifests itself, (B) is the layer of mathematical representations or
realizations and (C) is the layer of quantum concepts and principles. Note that the layers are coupled together as a whole, not via
their individual components.




Theoretical physics is basically about constructing optimal 
mathematical models of reality. It usually starts by effec- 
tively describing certain regularities apparent in some ob- 
served physical phenomena. The next step — if possible — 
is to relate different phenomena through the model. This 
amounts to reducing the number of independent param- 
eters in the models. Finally, one hopes that it will make 
predictions and suggestions as to where to look for unique 
signatures of new phenomena. Over time this modelling 
has been done in an ever more sophisticated way, exploit- 
ing existing as well as developing new mathematical and 
computational tools. 


A first step in modelling a physical system is to just identify 
which degrees of freedom are relevant to the phenomena 
one wants to study and understand. A second crucial step 
is to identify what the relevant interactions between these 
basic degrees of freedom are. For the moment these are 
just words referring to basic notions, which have to find 
their way into some symbolic representation or mathemat- 
ical framework. We may, in the end, have to extend our set 
of basic concepts and rules, our grammar so to speak, in 
order to accommodate new phenomena and new underly- 
ing principles. 


In the development of quantum theory over the past cen- 
tury, this is exactly what happened. It turned out that we 
needed new mathematical realizations and ever more so- 
phisticated representations of the material world. It is a 
multitude of unfolding insights intertwined with the dramatic 
growth of our experimental means to probe physical real- 
ity that marked the advances in theory over the last cen- 
tury. And finally, once the mathematical, maybe somewhat 
pragmatic modeling has advanced sufficiently, one should 
try to come to a more fundamental insight as to what these 
models imply for the logical structure of the underlying 
physical reality. Here we enter a realm with philosophical 
ramifications, where we move from the syntax anchored in 
the mathematical consistency of the model, to its semantics
and interpretation. We can pose ontological questions
about what is ‘to be and/or not to be’, as well as questions
about the epistemology and about what is ‘knowable’. We
enter the territories of beables and knowables: in short, 
the realm of meaning. 


Three layers. Quantum theory at large comprises a huge
body of knowledge that I like to think of as consisting of three
layers, as depicted in the columns of Figure 3. The A-layer
comprises the physical realizations and manifestations of
quantum matter, the B-layer is about mathematical representations
and realizations, and finally the C-layer concerns
underlying concepts and principles, and their logical
structure and interpretation. Indeed it is only after one has 
a mathematically consistent formulation of the theory that 
conceptual questions force themselves on us in a way that 
we can make sense out of them. Yet, one cannot avoid 
switching between the layers if one is to give a coherent 
account of the subject as a whole. 


The first layer A refers to quantum phenomenology, the 
body of observational evidence concerning the broad spec- 
trum of quantum phenomena that we will consider in this 
book. It is in fact the same as the first column in Figure 2, 
but note that the other columns of the two Figures refer to 
qualitatively entirely different things. 


The second layer B refers to mathematical representations 
or models. This is already more abstract, as we ascend to 
the mathematical modeling of the observed phenomena. 
One might for example think of quantum states being el- 
ements of some vector space referred to as the Hilbert 
space, or of the mathematics of a qubit, or of a wave func- 
tion. Or consider physical observables as represented by 
operators that act on that Hilbert space, like matrices or 
differential operators. And we may think of the dynamics 
of the quantum system described by famous differential 
equations, such as the Schrödinger, Heisenberg and Dirac 
equations. 
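

To make the B-layer a little more tangible, here is a minimal sketch in plain Python, assuming the simplest possible case of a single qubit. The state is a two-component vector, the observables are 2x2 matrices, and the expectation value is obtained by sandwiching the matrix between the state and itself. The names psi, Z, X and the helper functions are illustrative choices made for this sketch, not notation used elsewhere in the book.

```python
import math

# A qubit state a|0> + b|1> written as a 2-component vector.
# The equal superposition below is an illustrative choice.
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# Two observables represented as 2x2 Hermitian matrices (the Pauli Z and X matrices).
Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]


def apply(op, vec):
    """Act with a 2x2 matrix on a 2-component state vector."""
    return [sum(op[i][j] * vec[j] for j in range(2)) for i in range(2)]


def inner(u, v):
    """Hilbert-space inner product <u|v> (complex conjugate on the left)."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def expectation(op, vec):
    """Expectation value <vec|op|vec>; real for a Hermitian matrix."""
    return inner(vec, apply(op, vec)).real


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """The commutator [a, b] = ab - ba; nonzero means the order matters."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


print("<Z> =", expectation(Z, psi))   # vanishes for the equal superposition
print("<X> =", expectation(X, psi))   # equals 1 up to rounding
print("[X, Z] =", commutator(X, Z))   # not the zero matrix
```

In this state the average of Z vanishes while X has a definite value, and the nonzero commutator shows that the two matrices do not commute, which is the kind of structure that the conceptual layer then has to interpret.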


And indeed, in the middle column from bottom to top we
see increasingly complex realizations of the same quantum
principles, which are stated in the first step at the bottom.
It is a hierarchy of degrees of freedom. We start
with the discrete case of qubits and ‘qubit mechanics’, and 
move one step up to the simple continuous case of a single 
quantum particle. In the next step we face the problems of 
many particles of one single type or species, and the in- 
teractions between these species, which leads us to the 
theory of quantum fields. This level includes multi-particle 
states, and the creation and annihilation of particles; fur- 
thermore the forces are included and quantized. We finally 
end up with theories (and so far only theories) that attempt 
to combine all types of fields (or particle species) in the 
spectrum of a unique quantum (super)string. At this level 
space-time is included and quantized. So what we have 
indicated in the second column is the idea that states rep- 
resenting the physics at one given level form a small sub- 
space of the set of states in the next step. It represents a 
modelling hierarchy. 


We have mapped the system onto a mathematical model 
that allows us to make calculations and predictions, but 
models also pose new challenges for finding out what the 
essential concepts are that underlie all those quantum phenomena.
We would like to understand what the generic features
are that set the quantum world apart from what we were 
used to in classical physics. That is what the next layer is 
about. 


The third layer C is concerned with the conceptual impli- 
cations of the mathematical framework, where we are re- 
quired to interpret the basic mathematical entities back into 
physical terms. You may compare this to coming home 
from an exciting journey to some unknown country, and 
being forced to describe to your colleagues what the ex- 
quisite, extremely exotic food tasted like. You may think 
of mathematical models that manage to successfully de- 
scribe and predict measurement outcomes, but at the same 
time force us to reinterpret what the very nature of physical 
reality is. There is the saying cherished by many theorists
that ‘equations speak for themselves’, but that is often not
the case. For example, you may know that the mathemat- 
ics of special relativity is surprisingly simple, but its phys- 
ical ramifications are certainly not; it forced us to funda- 
mentally redefine our concepts of space and time. Some- 
thing similar happened in the realm of quantum theory with 
respect to the true nature of what we, for convenience, call 
‘matter’, or ‘radiation’, or ‘energy’, or ‘information’. Here 
we encounter the necessary consequences of the Hilbert 
space formalism, such as the existence of quantum en- 
tanglement and quantum interference. And we have to 
cope with non-commuting observables leading to funda- 
mental uncertainties in measurement outcomes. These 
unambiguous consequences of the mathematical formal- 
ism, which by itself is clear cut, will, as we will show, pose 
quite formidable epistemological and philosophical ques- 
tions. It suffices to refer to the infamous Einstein, Podolsky, 
Rosen (EPR) paradox, which lies at the heart of the
well-known Einstein–Bohr debate about how quantum theory
defines what we call reality. This debate has been going 
on for three quarters of a century and only now appears 
about to be settled. 


Going from left to right in Figure 3 is, in some sense, a per- 
spective marked by experimental discoveries and as such 
a rather historical perspective. Going from left to right is 
therefore hard because it is erratic, and it moves slowly ex- 
cept for sudden jumps. It is highly unpredictable because 
it basically lacks internal logic: there is no strictly logical 
path from classical to quantum physics. The path from left 
to right is the historic one, and therefore bumpy, but also 
paved with would-be miracles and intriguing misconcep- 
tions, which indeed make a wonderful narrative with ample 
heroism and drama. 


But once the subject has matured, there is the other pos- 
sibility, namely to start on the right with the concepts and 
a logical, abstract framework, and from there move back 
to the left. A theorist like myself naturally prefers a presentation
from right to left, which in a sense is highly anti-chronological,
but would be more comprehensive because
it has an internal logic and systematics. I believe that
once things are understood, going from right to left is easy. 
Moreover, it would give the author the freedom to limit 
himself to the quantessence of a coherent body of knowl- 
edge. 


Yet, in spite of this argument, it would be a bad idea to 
really treat the three layers sequentially from right to left, 
because you need the stuff on the left to appreciate the 
content of the right column. This suggests the option of a 
left-right compromise, or left-right coalition, as is
often the case in the politics of healthy democracies.


Combining parts and layers: the outline. After some
reflections on the general structure of the book, let me
now give a more detailed description of the layout of
the chapters. As mentioned, I have divided the book into
three parts or volumes, as indicated in Figure 2. Volumes
I and III are primarily descriptive and do not require much
math, since they are phenomenologically oriented. So in
the context of the layers, Volumes I and III mainly deal
with layer A, with some attention to layer B. Volume II, with a
title that refers to the ‘quantessence’, focusses more on
the mathematical and conceptual structure of the theory,
and covers the layers B and C. As a matter of fact, quantum
lovers with an outspoken fear of formulas may prefer to
read only Volumes I and III as a single coherent descriptive
account of what quantum theory has achieved. The following
preview may help you to make up your mind.


Volume I talks about The journey, where we follow a path
starting at the level of atoms, and descending deeper into
matter to the worlds of nuclei and elementary particles and
their interactions. This part is, so to speak, inward bound.
But before we embark on this descent in Chapter I.4, we
give a review of what classical physics is about in Chapter
I.1. Chapter I.2 deals with the very breakdown of classical
physics, from which crises the theories of relativity
and quantum emerged. Here we also included a section
on the physics of geometry and a section on the notions of
information and computation, highlighting another fundamental
turning point in twentieth century science and technology.
In Chapter I.3, on units, scales and universal constants,
we obtain surprisingly deep insights into the domains
of validity of our cherished theories by applying what we
call ‘dimensional analysis’. It provides us with a heuristic
quantitative sense of what the characteristic scales in nature
are, and why. In Chapter I.4 we describe the quest for
the basic building blocks of matter all the way from atoms
down to the most fundamental constituents of matter and
radiation.
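

As a small foretaste of dimensional analysis, the sketch below (plain Python, an illustration added here rather than a calculation taken from Chapter I.3) combines three universal constants into the only products that carry the dimensions of length, time and mass. These are the standard Planck scales. The numerical values of the constants are rounded SI values, and the function names are hypothetical choices made for this example.

```python
import math

# Rounded SI values of three universal constants (assumed inputs for this sketch).
HBAR = 1.054571817e-34   # reduced Planck constant, in J s
G = 6.67430e-11          # Newton's gravitational constant, in m^3 kg^-1 s^-2
C = 299792458.0          # speed of light, in m/s


def planck_length(hbar=HBAR, g=G, c=C):
    """The unique product of powers of hbar, G and c with the dimension of length."""
    return math.sqrt(hbar * g / c**3)


def planck_time(hbar=HBAR, g=G, c=C):
    """The time light needs to travel one Planck length."""
    return planck_length(hbar, g, c) / c


def planck_mass(hbar=HBAR, g=G, c=C):
    """The unique product of powers of hbar, G and c with the dimension of mass."""
    return math.sqrt(hbar * c / g)


print(f"Planck length: {planck_length():.3e} m")
print(f"Planck time:   {planck_time():.3e} s")
print(f"Planck mass:   {planck_mass():.3e} kg")
```

No dynamics enters this estimate: the scales follow purely from the units of the constants, which is why such results are heuristic and fix the answer only up to a numerical factor of order one.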


In Volume II, called Quantessence: how quantum theory
works, we give an accessible introduction to the mathematical
modelling tools and representations that comprise
quantum theory, including those which led to a number of
remarkable conceptual and semantic puzzles. This part
emphasizes the two deeper layers I alluded to before, i.e.
the layers B and C of Figure 3.


This second part also leads us deeper into the subjects
of quantum information and computing. To that end we
first have to contrast the setting of quantum theory with
its classical counterpart. In Chapter II.1, the first of Volume
II, we start by introducing quantum states as vectors
in Hilbert space. I discuss the structure of Hilbert
space for qubits and quantum information in quite a lot
of detail. In the second Chapter II.2, I discuss the quantessence
of observables, why we think of them as operators
acting on Hilbert space, and what it means to make
a quantum measurement. In this chapter the Heisenberg
uncertainty relations are also introduced. In Chapter II.3 I
talk about the measurement process more extensively with
a particular focus on quantum interference phenomena.
Chapter II.4 examines quantum entanglement and some
of the modern experiments addressing the profound questions
of cloning, Schrödinger’s cat, hidden variables, as
well as quantum teleportation and computation. In Chapter
II.5 I explain the concepts of quantum particles, fields
and strings. There, the famous equations of Schrödinger,
Heisenberg and Dirac that describe the time evolution of
states and observables, will be introduced. I also explain
properties like quantum spin, quantum statistics and their
relationships. In Chapter II.6, the final chapter of Volume
II, we introduce the notions of symmetry and symmetry
breaking which play a central role in all of modern physics.
The notion of symmetry served as a powerful guiding
principle in our quest to understand nature.


In the third and final Volume of the series we return to 
model physical reality but we now move upwards from the 
atomic scale. In Chapter III.1 we discuss how matter sequentially
evolved in the very early universe, from quarks,
to nucleons, to atoms, and from simple molecules to the 
basic (bio)chemistry concerning the molecules of life. 


Chapters III.2 and III.3 are devoted to the splendid diversity
of quantum phenomena in the physics of many-body systems
that are manifest in gaseous, liquid as well as solid
phases. Whereas in Chapter III.2 we consider the atomic and
nuclear lattices and to what extent these are ordered, the
focus in Chapter III.3 is on the electronic behavior in solids
and the quantum phenomena they display.


In Chapter III.4, we touch upon the quite advanced notions
of scale dependence and renormalization. Part III could 
well be called outward bound, certainly if reasoned from 
the atomic scale where quantum theory made its first ap- 
pearance. The criteria of inward and outward bound refer 
to the arrows in the left column of Figure 2. 


In the concluding Chapter III.5, we zoom out and look at 
the meaning and impact of quantum in the broader context 
of science, technology and society. 


After the concluding chapter you find a set of Math Ex- 
cursions, appendices in which we offer rather minimal but 
tailor-made introductions to the mathematics used through- 
out the book. 


Choosing the structure of a three-volume book means that 
we couple together the layers of Figure 3 so as to enable 
a coherent presentation of the quantessence as a whole, 
which is accessible without being too superficial. What you 
see is that quantum theory, even when you restrict yourself 
to the quantessence, is a huge field, and that is why I
divided the book up into three volumes.


As you may be aware, an impressive number of Nobel 
prizes have been awarded in the course of the past cen- 
tury to quantum discoveries in physics and chemistry. We 
list most of them in an appendix (on page 644) at the end 
of the final volume. There we also provide some of the 
chronology, and list the names of many of the influential 
thinkers who made seminal contributions to the field. It 
may also help you to follow up specific topics that have 
caught your interest while reading. 


I like to think of the three parts of the book as a kind of a
triptych, where the central panel covers the deeper quan- 
tum scenery, while the side panels are more descriptive 
and discuss lots of real physics, from quarks all the way up 
to bio-chemistry and the splendid diversity we encounter in 
the condensed states of matter.


We start this volume by a brief explanation of the core results of classical physics and how observations necessarily led


The gems of classical physics


Mission almost completed


‘Don’t laugh! There is a special section in purgatory
for professors of quantum theory, where they will 
be obliged to listen to lectures on classical physics 
ten hours every day.’ 

Paul Ehrenfest (in a 1927 note to Einstein) 


In this chapter we recall the great theories of classical
physics with an emphasis on their underlying principles.
You could call them ‘classessential.’ One by one they
represented turning points in our understanding of Nature.
First there are the four fundamental laws of Newtonian
mechanics and gravity, which open the door to the precise
description and modeling of general dynamical systems
through systems of differential equations. The second
pillar corresponds to Maxwell’s four laws of electromagnetism,
describing all classical electromagnetic phenomena
including those of light and radiation. The third
component consists of the macroscopic laws of thermodynamics
and how they are explained from the underlying microscopic
dynamics by the theory of statistical mechanics.
It is here that the notion of entropy emerges as a measure
of information or disorder.


Towards the end of the nineteenth century, physics almost 
came to an end. The physics community seemed opti- 
mistic and self-confident. Most of the observed natural 
phenomena appeared to be accounted for in the estab- 
lished framework of classical physics, at least in principle. 
It was hard to imagine what else could present itself to 
their curious minds. And indeed, after many centuries of 
scientific research, there had been not only highly impres- 
sive technical achievements, but fundamental laws of na- 
ture had also been established. 


I would like to warn against the sloppy use of terminology when
it comes to words like ‘models,’ ‘theories’ and even ‘laws
of nature.’ It is the rule rather than the exception among scientists
to treat these concepts as basically interchangeable. We
refer to the theories of Newton and Einstein, by which we
mean the mathematical description of certain theoretical
models of nature’s behavior, which in fact have been so
successful that they are often also referred to as ‘laws of
nature.’ Yet they are not true in any absolute sense: there
is no guarantee that we will not one day find that nature
violates such a law of nature in a domain of reality that we
cannot yet observe. So, on the one hand, one might as
well put these cherished ‘laws’ in the category of ‘working
hypotheses,’ in view of the fact that we can never fully
exclude the possibility that they are false. On
the other hand, the ‘laws’ have proven to be remarkably
robust quantitative statements on the workings of nature
that have survived centuries of ever more extensive (and
expensive) experimental tests. In that sense they express
some of the core messages carried by nature about our
world, about what and who we are, and how things ended
up this way. They may not tell us why we are here but at
least how we got here. It appears that modern science in
many ways liberates us from the narrow anthropocentric
views that are as dominant as they are questionable in the
debate about what the place and future of humankind in this
universe may be.


When I talk about the breakdown of classical physics, I
refer precisely to the type of breakdown where the declared
universality of laws turned out to primarily express
our overconfidence. The term breakdown here is not so
much a matter of whether a theory is right or wrong, but 
rather marks the limited domain of validity of any particular 
theory. In any pragmatic sense there is nothing wrong with 
classical physics as long as you apply it to problems within 
its domain of validity. You may compare it to the situation in 
biological evolution where it is evident that we have passed 
beyond the stage of bacteria, but that doesn’t stop them 
from being around and still playing a crucial role. 


What the notion of classical physics refers to depends on
the context in which it is discussed. Often ‘classical’ is
contrasted with ‘quantum’, and in that case we can consider
the theory of relativity to be part of classical physics.
We could however also contrast ‘classical’ with ‘modern’,
and in that case we can draw the line at the end of the
nineteenth century and count both relativity and quantum
theory as parts of ‘modern’ physics. It is this latter
distinction that we will make in this chapter. The use of the
word ‘modern’ may strike you as remarkably inappropriate,
because this ‘modern’ physics was to a large extent formulated
a century ago; ‘modern’ in this context clearly does
not mean contemporary. Modern theory in this context
apparently just means that we have not yet encountered the
limits of its domain of validity. In this chapter we start by
briefly recalling the core messages of the classical theories
of mechanics and the gravitational force, of the theory
of electromagnetism and light, and of the theories of
thermodynamics and statistical physics.

Figure I.1.2: Newtonia: Composition with bound orbits. (Image
constructed using visualization & graphics tools of the Mathematica
package.)


In the next chapter we briefly summarize how certain crises 
in classical physics seeded two fundamental turning points 
in our thinking about nature: relativity and quantum phys- 
ics. In that chapter we also introduce the basic concepts 
of information theory, as this branch of science is now also 
heading towards a quantum revolution. 

In the third chapter we delve deeper into the notion of the 
domain of validity of a model and discuss how the partic- 
ular values of the universal constants that appear as pa- 
rameters in physical models basically set the scale of our 
universe. 




The fourth chapter gives an account of our progressive
insights into what the basic building blocks of nature are, from
the atomic level all the way down to superstrings.


Newtonian mechanics and gravity 


Newton’s work led to the unified description of terrestrial
and heavenly mechanics and involved the creation of the 
mathematics of change, called differential calculus, which 
in turn gave rise to the birth of the general theory of dy- 
namical systems. 


Four laws only 


Back to the achievements of classical physics. Firstly there
are Newton’s four laws, described in his ingenious Principia
Mathematica published in 1687. Three of those laws
constitute the foundations of mechanics: (i) the law of inertia,
(ii) the force law and (iii) the law of action and reaction.
The fourth law is the law that defines the gravitational force
between two masses.


The first law: the law of inertia. The law of inertia postulates
that if a body is at rest or moving at a constant speed
in a straight line, it will remain at rest or keep moving in a
straight line at constant speed unless it is acted upon by a
force. This property is called inertia. We have illustrated
the distance traveled x(t) for a body of some mass m, for
two constant velocities v₁ < v₂, in Figure I.1.3. In the
absence of a force the distance traveled is proportional to the
elapsed time, in other words: x(t) = vt.¹ The first law led
to the fundamental notion of momentum, where the momentum
p of an object is defined as the product of its mass
m and its velocity, p = mv. This linear relation between
momentum and velocity is depicted in Figure I.1.4, where
the slope of the line by definition equals the mass. Momentum
is also referred to as the ‘amount of motion,’ and if you
don’t have a feeling for it, think of it as impact. If somebody
throws a large brick at you, the impact will be much larger
than if that same person had thrown a piece of
foam of the same shape with the same velocity. The first
law states that in the absence of a net force on an object
its momentum will not change. Zero force means that momentum
is conserved, and this implies that the velocity is
constant.


¹We adopt the notational convention where symbols denoting
vector-like quantities like position, velocity, momentum and force are
given in bold except when we are dealing with one spatial dimension.
For the length and the length squared of a vector we write |v| = v and
v·v = |v|² = v². Scalar quantities like mass are set in the default
typeface.


Figure I.1.3: Newton's first law. In the absence of a force a body
will move at a constant velocity and momentum. In the figure the
distance traveled as a function of time x(t) for a body of mass m is
plotted for two constant momenta p₁ and p₂, corresponding to
the two arrows.


The second law: the force law. The second law, called
the force law, is the well-known relation between the force
F applied to a body and the resulting acceleration a, given
by the formula F = ma.² As acceleration is the rate of
change in velocity, the force is then equal to the rate of
change in momentum. A brilliant aspect of this equation is
that at first glance it doesn’t seem to hold. I remember as
a kid pulling other kids on a sled through the snow: yes, I
had to pull to get the sled moving, but if I stopped pulling
it did not keep moving with constant velocity, as I thought
should be concluded from the law. No force, no change in
momentum. But the sled immediately came to a halt after
I stopped pulling it. I had to conclude that there should
be another force in action, and indeed there was: it was
the resistive force of the snow. Now that is a funny force
that opposes motion: the greater the velocity, the greater
the force in the opposite direction. It is as subtle as the
workings of the opposition in parliament. But postulating
a force with such a subtle adaptive power, is that not just
postulating what you see, postulating the facts you wished
to explain? Well, don’t put the book aside yet, there is more
to come.


²Actually it should be written as F = dp/dt, where strictly speaking
there is an extra contribution because dp/dt = v dm/dt + m dv/dt.
The first term, proportional to the change in mass, is considered to be
zero because for a single particle one assumes a constant mass. But
for a rocket burning its fuel this is no longer true.


Figure I.1.4: Definition of momentum. Newton defined
momentum as the ‘quantity of motion’ directly proportional to
the velocity of the object; the proportionality constant equals the
mass m of the object.


Figure I.1.5: Newton’s second law. We have drawn a segment
of the orbit in 2 dimensions of a particle with mass m under a
constant force F in the x₁-direction. This could be a particle with
charge q in a constant electric field E exerting a force F = qE.
Note that the momentum p at time t points along the slope of the
particle’s trajectory x(t). Because the force F points in the
x₁-direction, only the component p₁ will increase while p₂ remains
constant.


The third law: action is reaction. A simple example of
the law of ‘action is reaction’ is provided by a book at rest
on a table as depicted on the left in Figure I.1.6. Gravity
pulls the book down (light blue arrow attached to the centre
of mass, pointing down); it equals the force of the book on
the table (dark blue arrow pointing down), and indeed, the
book would fall down were it not for the table exerting an
equal but upward directed normal force on the book. It is
this balance of forces that act on an object which is the
main topic of statics. It means that the net force, but also
the net torque, on an object should be zero, and that does
not only explain the stability of architectural structures like
bridges, arches or cathedrals, but also the stability of the
chicken at rest on the table at the right-hand side of the
figure.


Figure I.1.6: Newton's third law. The third law Action =
Reaction applies to a chicken at rest on a table. The downward
gravitational force can be represented by the large light blue
arrow attached to its centre of mass. Through its legs it exerts in
two places a force on the table, and the table exerts a reaction
force exactly equal and opposite at the points of contact. The
net force, which is the sum of the light and dark blue arrows on
the chicken, is zero and its change in momentum will be zero so
it doesn’t move. But why are the forces of the two legs unequal?
That is to make sure that the chicken doesn’t fall over sideways.
This requires that the torque on the chicken has to be zero as
well, so that its angular momentum does not change.


A more subtle example of the third law is provided by a
game called ‘arm wrestling.’ Two individuals (still mostly
men) sit at opposite sides of the table, fix their elbows
on the table and try to push each other’s hand towards the
table. For quite some time nothing seems to happen, in
spite of the fact that both individuals do their utmost
to get the fists moving. As long as nothing happens the net
forces of the hands are in perfect balance, a situation that
is called a static equilibrium. This lasts until the balance is
broken, leading to a net force in one direction causing both
fists to start moving, until one hand is forced on the table
and somebody has to order a round of beer.


Figure I.1.7: Newton's third law. The third law Action =
Reaction applies to arm wrestling, also when the balance of power
is broken and the ‘red’ force is larger than the ‘grey’ force. An
explanation is given in the text.


The question you may wrestle with is whether in such a
dynamic situation the action-is-reaction law still holds. So,
let us look more closely at how to apply the fundamental
action-is-reaction law in such a dynamic setting. This
is explained in Figure I.1.7, where we give a schematic of
the forces involved. We identify three different instances
where the third law can be applied. Firstly, on the left side
we have the force of the red arm r on the ‘red’ hand R
(denoted by $F_{rR}$), which indeed equals the opposite force of
the red hand on the red arm ($F_{Rr}$). In the middle we have
the force of the red hand on the grey hand ($F_{RG}$), and the
force of the grey on the red hand ($F_{GR}$); these have to be
equal because of the third law applied at the interface between
the hands. On the right side we have the force of the
grey arm on the grey hand ($F_{gG}$) and the equal and opposite
force ($F_{Gg}$). So we see that the third law should be applied
three times, referring to three different forces. If both
hands move with an acceleration a we can firstly apply the
force law to the red hand, telling us that the net force on it
is $F_{rR} - F_{GR} = ma$; applying it to the grey hand yields
$F_{RG} - F_{gG} = Ma$. Next we use the result that $F_{RG} = F_{GR}$
to ascertain that the net force $F_{rR} - F_{gG} = (m+M)a$,
which is the force law applied to the system of both hands.
This argument shows that the hands can be in accelerated
motion, not in spite of, but rather thanks to the fact that the
law of ‘action is reaction’ remains valid all along. It illustrates
the important fact that ‘action is reaction’ is a general
law that is applicable as long as the objects exerting
force on each other stay in contact.


Figure I.1.8: Newton's fourth law. This is Newton’s famous
‘inverse square law’ for the gravitational force between two massive
objects as a function of their distance r. The force only has
a radial component, which is negative, meaning that the force is
attractive.
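
The bookkeeping of forces in the arm-wrestling argument can be checked with a few lines of Python. This is only a sketch: the function name and the numerical values (forces in newtons, hand masses in kilograms) are hypothetical and chosen purely for illustration. It applies the force law to both hands together and then to each hand separately, and shows that the two contact forces between the hands come out equal even though the hands accelerate.

```python
def arm_wrestling(F_rR, F_gG, m, M):
    """Return (a, F_GR, F_RG) for two hands in contact.

    F_rR: push of the red arm on the red hand (taken positive)
    F_gG: push of the grey arm on the grey hand (opposite direction)
    m, M: masses of the red and the grey hand
    """
    a = (F_rR - F_gG) / (m + M)   # force law for both hands together
    F_GR = F_rR - m * a           # force law for the red hand:  F_rR - F_GR = m a
    F_RG = F_gG + M * a           # force law for the grey hand: F_RG - F_gG = M a
    return a, F_GR, F_RG


# Hypothetical numbers: red pushes harder than grey.
a, F_GR, F_RG = arm_wrestling(F_rR=120.0, F_gG=100.0, m=0.5, M=0.5)
print("acceleration:", a)
print("grey on red:", F_GR, " red on grey:", F_RG)
```

With these made-up inputs the hands accelerate, yet the force of grey on red and of red on grey are the same number, in line with the argument in the text.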


The fourth law: the law of gravitation. Newton’s fourth
law is his celebrated universal law of gravitation,

$$\mathbf{F} = -G_N \frac{m_1 m_2}{r^2}\,\hat{\mathbf{r}}, \qquad \text{(I.1.1)}$$

expressing the attractive gravitational force between two
masses m₁ and m₂ as proportional to the inverse square
of their distance r. The force as a function of the distance
is depicted in Figure I.1.8. Note that here the principle of
‘action is reaction’ is indeed respected implicitly: because
it is the force ‘between’ two objects, they experience an
equal force in opposite directions. Indeed it is so universal
that it applies with the same constant G_N equally well to a
pencil dropping on the floor (the earth) as to the motions in
the solar system or to the motion of stars in the Milky Way.
It was justly said that Newton with this law unified celestial
and terrestrial mechanics. Substituting this gravitational
force in the second law, one can solve the system for general
planetary orbits around a star. They correspond to the
well-known conic sections depicted in Figure I.1.9, where
the top two are the bound circular or elliptic orbits, and the
bottom two are the unbound parabolic and hyperbolic orbits.


Figure I.1.9: Conic sections. The general solution for the orbits
of a planetary object around a star can be obtained by inserting
the gravitational force in the second law. The resulting orbits
correspond to the conic sections depicted above.
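
To make the four types of orbit a little more concrete, here is a minimal Python sketch that classifies an orbit from the position and velocity of a small body around a fixed central mass. It relies on a standard textbook relation that is not derived in this chapter and is taken here as an assumption: the eccentricity e of the conic section satisfies e² = 1 + 2εh²/(G_N M)², where ε and h are the energy and angular momentum per unit mass. The function name, the units (G_N M = 1) and the sample velocities are hypothetical.

```python
import math


def orbit_type(r, v_tangential, v_radial=0.0, GM=1.0, tol=1e-9):
    """Classify the orbit of a small body around a fixed central mass."""
    # Energy and angular momentum per unit mass of the orbiting body.
    energy = 0.5 * (v_tangential**2 + v_radial**2) - GM / r
    ang_mom = r * v_tangential
    # Assumed standard relation for an inverse-square force.
    e_squared = 1.0 + 2.0 * energy * ang_mom**2 / GM**2
    e = math.sqrt(max(e_squared, 0.0))  # guard against tiny negative rounding
    if e < tol:
        return "circle", e
    if abs(e - 1.0) < tol:
        return "parabola", e
    return ("ellipse", e) if e < 1.0 else ("hyperbola", e)


# Launch sideways at distance r = 1 with increasing speed.
for v in (1.0, 1.2, math.sqrt(2.0), 2.0):
    print(v, orbit_type(r=1.0, v_tangential=v))
```

In these units the circular speed at r = 1 is 1 and the escape speed is √2, so the four sample speeds run through the bound and unbound cases in turn.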

Dynamical systems


Let me say a little more on how these laws of Newto- 
nian mechanics furnished a first and powerful description 
of what nowadays is called a dynamical system, a sys- 
tem described by a set of variables whose values change 
over time. Thinking about mechanics that way, one would 
rewrite the laws in a different way that illuminates the dy- 
namical system’s perspective. 


Phase space. First we say that if we look at a particle as a
system, then it has at any instant in time a state that is
labeled by two variables, its position x and its momentum p.
So, we may think of the state of the system as corresponding
to a point in (x, p)-space. This space is usually called
the phase space P_ph of the system. For a particle moving
in ordinary space P_ph is six-dimensional, because we
have to specify the three components of its position and
the three components of its momentum. The dynamics of
the system can be envisaged as a trajectory (x(t), p(t)) of
the point that represents the state of the system, through
P_ph. This trajectory is then specified by giving the rule
which tells you where the system goes if you give the point
at some initial time t₀.


Differential equations. This rule is like an incremental 
prescription, it specifies an infinitesimal change by using 
the notion of a (time) derivative (d/dt) as a measure of 
change: 


$$\left.\frac{d\,\text{Something}}{dt}\right|_{t_0} \equiv \{\text{change of that ‘Something’ per unit time at } t = t_0\}.$$


Equations involving this (time) differential are called differential
equations, to contrast them with algebraic equations
(like the quadratic equation ax² + bx + c = 0)
in which algebraic expressions in the variables appear but
no derivatives. If the equations involve time derivatives, we
speak of the equations of motion. If the system is closed,
the change will depend only on the state of the system at
earlier times. In the quite common case that the system
has no memory (like the system of the sun and the earth
in the Newtonian picture) the change at some particular
time t only depends on the state of the system at time t. It
is generally agreed upon that sun and earth do not wrestle
with sleepless nights caused by bad memories. So the
dynamical system with a set of N independent variables {y_i}
with i = 1, ..., N would look like a set of N coupled equations
describing the change of the system by specifying
the N components f_i of the change vector, each of which
may in turn depend on the set of all variables {y_j}:

$$\frac{dy_i}{dt} = f_i(\{y_j\}). \qquad \text{(I.1.2)}$$


Figure I.1.10: Dynamical system. We display the vector field
corresponding to a particular example of (I.1.2). This means that
in each point (y₁, y₂) we plot the vector (arrow) with components
dy₁/dt = −y₂ + cos 2y₁ and dy₂/dt = y₁ + sin 2y₂.
Solutions of the system correspond to trajectories that start from
a given point in (y₁, y₂)-space, following the arrows. In this case
we give some trajectories starting on the y₁-axis that converge
either to the blue or the red limit cycle.


The functions f_i({y_j}) encode the interactions between the
different variables. In other words, they include
the mutual dependence of the variables and of course a number of
external parameters which typically appear as the coefficients
of the terms specifying the interactions. The f_i({y_j}) correspond
to the components of the ‘change vector’ at any
point in phase space, which means that they define a vector
field over the configuration or ({y_i}) space. This vector
field forms a powerful mathematical representation of
the dynamical system as a whole. It depicts the phase
space as a fluid flow. If we drop autumn leaves into the
flow, they will start to move, following the particular flow
lines which correspond to the particular solutions of the
dynamical system. We have depicted a particular vector
field in Figure I.1.10, which also shows two sets of trajectories
described by solutions of the dynamical system
for different starting points on the y₁-axis. The trajectories
are obtained by locally following the direction of the vector
field. The dictum is indeed: ‘go with the flow.’ The orbits
are seen to converge on one of two different closed limit
cycles.


Yet another way to look at a dynamical system is that it 
represents an algorithm that takes input information, the 
vector defining the initial point in phase space and moves 
or ‘processes’ it, to some final state. 
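
The view of a dynamical system as an algorithm can be taken quite literally. The sketch below, in Python, moves a point through phase space by repeatedly ‘going with the flow’ of a vector field f. The stepping rule used is the classical fourth-order Runge-Kutta method, which is a choice made here for illustration and not something prescribed by the text; the function names are hypothetical. As an example we take the harmonic oscillator with m = k = 1, whose vector field is (dx/dt, dp/dt) = (p, −x).

```python
import math


def rk4_step(f, y, dt):
    """Advance the state y by one time step dt along the vector field f."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def flow(f, y0, dt, n_steps):
    """Process the initial point y0 into the state after n_steps steps."""
    y = list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, y, dt)
    return y


def oscillator(y):
    """Vector field of the harmonic oscillator with m = k = 1."""
    x, p = y
    return [p, -x]


# One full period of the oscillator lasts 2*pi in these units.
n = 1000
x, p = flow(oscillator, [1.0, 0.0], dt=2.0 * math.pi / n, n_steps=n)
print(x, p)  # should be close to the starting point (1, 0)
```

Any other system of the form dy_i/dt = f_i({y_j}) can be fed to the same `flow` function by supplying a different vector field.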


Writing Newtonian mechanics in this format the first and
second laws look like,

$$\frac{d\mathbf{x}}{dt} = \frac{\mathbf{p}}{m}, \qquad \frac{d\mathbf{p}}{dt} = \mathbf{F}. \qquad \text{(I.1.3)}$$

They completely specify the motion of the point in phase 
space, where the force F = F(x, p) may in general depend 
on the position and velocity of the particle. It is custom- 
ary to treat the earth-sun system by keeping the sun fixed 
in the origin (a good approximation because the sun has 
a huge mass) and let the earth move through the grav- 
itational force (the fourth law) that only depends on the 
length of the position vector r = |x|. The third law is ba- 
sically a constraint on the system: if we had included the 
position and momentum of the sun as independent vari- 
ables, then the third law would require that the same force 


F would appear with the opposite sign in the equations for 
the sun and for the earth respectively. From this example it 
is also clear that the functions on the right-hand side of the 
equations do not only depend on the variables, but also on 
certain parameters that set the strength of the couplings 
or interactions. These parameters, like the masses or
Newton’s gravitational constant, are supposed to be constant
but must of course be varied to find the best fit to the
experimental data. They are the input parameters of the 
model. It is here that Occam's razor — the principle of ra- 
tional minimalism — applies, decreeing that if two models 
perform equally well, the one with the fewest parameters 
is to be preferred. 
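
As an illustration of the earth-sun system in this phase-space form, the following Python sketch keeps the sun fixed in the origin and lets the earth move under the inverse-square force. Several things here are assumptions made for convenience: the units (distances in astronomical units, time in years, so that G_N M = 4π² gives a one-year circular orbit at distance 1), the earth mass set to 1, the restriction to two dimensions, and the ‘velocity Verlet’ stepping scheme. None of these choices comes from the text.

```python
import math

GM = 4.0 * math.pi**2   # assumed units: AU^3 / yr^2
m = 1.0                 # earth mass in arbitrary units


def force(x):
    """Inverse-square force on the earth, pointing towards the origin."""
    r = math.hypot(x[0], x[1])
    return [-GM * m * x[0] / r**3, -GM * m * x[1] / r**3]


def evolve(x, p, dt, n_steps):
    """Integrate dx/dt = p/m, dp/dt = F with the velocity Verlet scheme."""
    F = force(x)
    for _ in range(n_steps):
        p = [p[i] + 0.5 * dt * F[i] for i in range(2)]   # half kick
        x = [x[i] + dt * p[i] / m for i in range(2)]     # drift
        F = force(x)
        p = [p[i] + 0.5 * dt * F[i] for i in range(2)]   # half kick
    return x, p


# Start at 1 AU with the circular-orbit momentum and integrate for one year.
x, p = evolve([1.0, 0.0], [0.0, 2.0 * math.pi * m], dt=1e-3, n_steps=1000)
print("position after one year:", x)
```

The parameters GM and m play exactly the role of the input parameters of the model mentioned above: changing them changes the orbit, while the form of the equations stays the same.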


Conservation laws


The tears of the world are a constant quantity. For
each one who begins to weep, somewhere else
another stops. The same is true of the laugh.
Samuel Beckett, Waiting for Godot


Note that with the dynamical laws for the fundamental variables,
one can also calculate the time evolution of other
(x- and p-dependent) dynamical variables. One such
variable is the energy, often called the Hamiltonian and
denoted as H. It should be thought of as a function H =
H(x, p) of the basic state variables x and p. Another such
variable is the angular momentum L = x × p, which is basically
the amount of rotational motion, or the rotational
momentum. We will return to these quantities shortly.


Under certain circumstances it may happen that some dynamical
variables are conserved, meaning that they do not
change over time. These are often called constants of the
motion. For example in Newtonian mechanics, if there
is no force, that is we have that F = 0, then
equation (I.1.3) tells us immediately that the momentum does
not change: it stays constant and is thus ‘conserved.’ On
the other hand, if the force only depends on the distance
and not on the direction (as is the case in Newton’s
gravitational force), then the angular momentum will be
conserved as we will explain shortly. So, if the time derivative
of some physical quantity Q equals zero:

$$\frac{dQ}{dt} = 0, \qquad \text{(I.1.4)}$$

we call the equation a conservation law for Q, because
the amount of Q is constant in time.


Energy conservation. Of special interest is the case of 
energy conservation because it is of general validity and 
applies to basically all observed processes in nature that 
are physically based. Let us for convenience restrict our- 
selves to a (one-dimensional) situation which is simple but 
also surprisingly common, where the energy H consists of 
two parts, a kinetic energy part U(p) which only depends 
on the momentum, and a potential energy part V(x) that 
only depends on position: 


H(x, p) = U(p) + V(x). (I.1.5)


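Then its time derivative can be calculated:³

$$\frac{dH}{dt} = \frac{dU}{dt} + \frac{dV}{dt} = \frac{dU}{dp}\frac{dp}{dt} + \frac{dV}{dx}\frac{dx}{dt} = \frac{dU}{dp}\,F + \frac{p}{m}\,\frac{dV}{dx}.$$

We see that the energy will be conserved if the terms on
the right-hand side cancel each other. This requires that
the following equalities have to hold:

$$\frac{dU}{dp} = \frac{p}{m}, \qquad \text{(I.1.6a)}$$

$$-\frac{dV}{dx} = F. \qquad \text{(I.1.6b)}$$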


³The second equal sign involves the use of a mathematical identity
called the chain rule which says that if U(p) depends on t only through 
its dependence on p(t) , the time change can be found by first calculat- 
ing the change in U because of a change in momentum, multiplied by 
the change in time of the momentum. It roughly means that one may 
cross out the dp factors of the numerator and denominator. 


The first condition leads to the well-known expression U =
p²/2m, while the second restricts the force in that it has to
be equal to minus the spatial change of some potential energy
function V. Such a force field is not surprisingly called
conservative, exactly because its action ‘conserves’ the total
energy of the system. Whereas a ‘conservative force’ is
standard physics jargon, I have never come across terms
such as ‘liberal’ or ‘progressive’ forces, though if we get to
the strong nuclear force, other evocative terms will surface,
like ‘asymptotic freedom’ and ‘infrared slavery.’
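
The cancellation that makes the energy conserved can be checked numerically at any point in phase space. In the Python sketch below, the rate of change dH/dt = (dU/dp) F + (p/m) dV/dx is evaluated with dU/dp = p/m, once for a conservative force F = −dV/dx and once with an extra friction term added. The potential V = x⁴/4, the friction coefficient and the sample point are hypothetical choices for illustration only.

```python
def dH_dt(x, p, m, dV_dx, force):
    """Rate of change of H = U(p) + V(x), with U = p^2 / 2m."""
    dU_dp = p / m
    return dU_dp * force(x, p) + (p / m) * dV_dx(x)


def dV_dx(x):
    """Slope of the hypothetical potential V(x) = x^4 / 4."""
    return x**3


def conservative(x, p):
    """Force equal to minus the slope of the potential."""
    return -dV_dx(x)


gamma = 0.3  # hypothetical friction coefficient


def with_friction(x, p):
    """Same force plus a resistive term that opposes the motion."""
    return -dV_dx(x) - gamma * p


x, p, m = 0.7, 1.3, 2.0
print("conservative:", dH_dt(x, p, m, dV_dx, conservative))
print("with friction:", dH_dt(x, p, m, dV_dx, with_friction))
```

For the conservative force the two terms cancel and the energy stays put, whereas the resistive term (like the snow under the sled) makes dH/dt negative, so energy leaks away.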


Applying a (net) force means doing work. If we apply
a force F to a mass m, the mass will accelerate and over
time its kinetic energy will change. If we push a stroller,
we do work by applying a force on it. If we put a charge in
an electric field, the charge will start moving because it is
the field that exerts a force that causes the motion and it is
the field that does the work. The change in kinetic energy
ΔU by definition equals the amount of work ΔW that the
force has done. If the force is constant, this means that
ΔW = F·Δx. If the mass moves through a conservative
force field F(x) and it moves along a certain path γ from x₀
to x₁, we know from conservation of the total energy that
ΔE = 0, and thus ΔU = −ΔV = V(x₀) − V(x₁). The
amount of work in an arbitrary force field can be expressed
as the line integral of the force field along a path of motion γ:

$$W = \int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}(\mathbf{x}) \cdot d\mathbf{l},$$

where the line element dl is the infinitesimal vector tangent
to the path γ in the point x. For a conservative force field
we get,

$$W = -\int_{\mathbf{x}_0}^{\mathbf{x}_1} \nabla V(\mathbf{x}) \cdot d\mathbf{x} = V(\mathbf{x}_0) - V(\mathbf{x}_1),$$

and we see that the change in potential energy equals the
difference of the potential energies at the endpoints of the
path, consistent with the conservation of total energy E.
The fact that the difference only depends on the endpoints
means that the increase of energy is not dependent on the
particular path chosen. If you want to climb to the top of a
mountain, you can choose between a path that is long and
not so steep or a very short, very steep path; in either case
you would have to deliver the same amount of work.


Figure I.1.11: A line integral. In the upper picture we give a
two-dimensional potential surface V(x). The force field is defined as
F(x) = −∇V(x). If we choose a path from point x₀ to x₁, we
can integrate F along that path. This means that we need to
integrate the component that is tangential to the path. This line
integral yields the value W = V(x₀) − V(x₁), which equals the
work performed by the force, which in this case is negative. We
had to exert a force to go uphill and therefore the potential
energy was increased. Note that the outcome is independent of
the path chosen.
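
The path independence of the work in a conservative force field is easy to verify numerically. The Python sketch below approximates the line integral of F = −∇V along two different paths between the same endpoints by summing F·Δx over many short straight segments. The potential V(x, y) = x² + 3y² + xy, the two paths and the number of segments are all hypothetical choices made for this illustration.

```python
def V(x, y):
    """Hypothetical two-dimensional potential surface."""
    return x**2 + 3.0 * y**2 + x * y


def F(x, y):
    """Force field F = -grad V for the potential above."""
    return (-(2.0 * x + y), -(6.0 * y + x))


def work(path, n=2000):
    """Approximate the line integral of F along path(s), s from 0 to 1."""
    W = 0.0
    for i in range(n):
        x0, y0 = path(i / n)
        x1, y1 = path((i + 1) / n)
        # Evaluate the force in the middle of each short segment.
        Fx, Fy = F(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        W += Fx * (x1 - x0) + Fy * (y1 - y0)
    return W


def straight(s):
    """Straight line from (0, 0) to (1, 1)."""
    return (s, s)


def curved(s):
    """Parabolic detour between the same endpoints."""
    return (s, s**2)


W_straight = work(straight)
W_curved = work(curved)
print(W_straight, W_curved, V(0.0, 0.0) - V(1.0, 1.0))
```

Both paths go ‘uphill’ in this potential, so the work done by the force is negative and, up to rounding, equal to V(x₀) − V(x₁) in either case.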


The harmonic oscillator. A simple example of a conservative
force is the one-dimensional elastic force, applied
to a mass m hanging on a spring attached to a beam as
depicted in Figure I.1.12,

$$F = -kx, \qquad \text{(I.1.7)}$$

where x is the deviation of the mass from its equilibrium
position, k is the elastic constant that characterizes the
spring and the minus sign indicates that the force the spring
exerts is opposite to the displacement. The force tends to
restore the equilibrium state. Because the force increases
linearly with the distance x, and according to
equation (I.1.6) it has to equal minus the derivative of the
potential energy, we may conclude that the corresponding
potential V satisfying that condition has to grow quadratically
with x (up to an irrelevant constant term):

$$V(x) = \tfrac{1}{2} k x^2. \qquad \text{(I.1.8)}$$

We have depicted the energies V, U and H corresponding
to the resulting oscillatory motion in Figure I.1.13. The
spring keeps oscillating with a fixed (angular) frequency, which is
equal to √(k/m), a fixed amplitude and a fixed total
energy H. These motions correspond to the configuration
space picture of Figure I.1.12, and the phase space picture
of Figure I.1.14. This harmonic oscillator is quite ubiquitous,
because systems are most of the time in equilibrium.
And if we perturb such a system, it typically starts
oscillating around its equilibrium configuration and in real
cases it usually relaxes back to equilibrium because of
frictional forces. So the harmonic potential is the simplest
approximation that corresponds to the ‘linear’ response of
the system, which should hold as long as the perturbations
are small. This quadratic potential is crucial and will also
show up in many different guises at all levels of (quantum)
mechanics.


Figure I.1.12: The oscillating mass. A model system consisting
of a mass m attached to a spring. The inset shows the
oscillatory motion of the mass in configuration (x, t)-space.


Figure I.1.13: The harmonic oscillator. The harmonic potential
V(x) = ½kx² in red. The equilibrium point is at the origin,
the resulting force is F = −kx and is always directed towards
the equilibrium point. If there is no friction, the position x will
oscillate around the origin with a fixed amplitude and a fixed
total energy H.
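
The exchange between kinetic and potential energy in the frictionless oscillator can be tabulated in a few lines. The Python sketch below assumes the standard solution x(t) = A cos(ωt) with ω = √(k/m), which is not written out in the text, and uses hypothetical values for the mass, the spring constant and the amplitude.

```python
import math

m, k, A = 2.0, 8.0, 0.5        # hypothetical mass, spring constant, amplitude
omega = math.sqrt(k / m)       # angular frequency sqrt(k/m)


def state(t):
    """Position and momentum of the oscillating mass at time t."""
    x = A * math.cos(omega * t)
    p = -m * omega * A * math.sin(omega * t)
    return x, p


def energies(t):
    """Kinetic energy U, potential energy V and total energy H at time t."""
    x, p = state(t)
    U = p**2 / (2.0 * m)
    V = 0.5 * k * x**2
    return U, V, U + V


for t in (0.0, 0.3, 0.9, 2.0):
    U, V, H = energies(t)
    print(f"t = {t:3.1f}   U = {U:.4f}   V = {V:.4f}   H = {H:.4f}")
```

U and V take turns, but their sum H stays equal to ½kA² at all times.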


Newton's gravitational potential. The most well-known po- 

tential is the gravitational potential due to a mass M lo- 

cated at the origin in Newton’s theory, defined as: 
_ GNM 


V(r) = - 


(1.1.9) 


where we are now in three dimensions and r denotes the
length of the position vector, r = |x|. Note that the potential
energy is taken to be zero at infinity. The potential energy
of a mass m at a distance r equals mV(r), and it does indeed
lead to Newton’s celebrated ‘inverse square’ law (I.1.1).
If we let the particle go at some position r, it will move
radially inward, thereby lowering the
potential energy, but at the same rate increasing its ki- 


Figure I.1.14: Periodic orbits. The phase space vector field corresponding
to the harmonic oscillator with m = k = 1 becomes
(dx/dt, dp/dt) = (p, −x). The orbits are closed periodic curves
(circles in these units), one for each value of the total energy.
The origin is a fixed point that coincides with the particle at
rest.


netic energy so that the total energy remains the same. 
The conclusion of this part of the story is that if the con- 
ditions (1.1.6) are met, the total energy will be conserved if 
the system evolves according to Newton’s laws. 


Angular momentum. Another important conserved quan- 
tity (in a problem with spherical symmetry) is the angular 
momentum L, which is a vector quantity just like position 
or momentum (velocity) and has three components, each 
of which is conserved. You experience that conservation 
law if you are cycling. If the wheels spin fast, the angular 
momentum vector will be directed along the axles of
the wheels, and the conservation law is reflected
in the stability of the bike at high speed. Kids apparently
know about this law because they like to take both
hands off the handlebars. However, if they slow down, they
have to be careful not to topple over sideways, as small external
disturbances may cause a torque that changes the
angular momentum, breaking the conservation law.

Figure I.1.15: The Like-rule. The defining relation of ‘angular
momentum’, or the ‘amount of rotation’ of a particle in some orbit.
It is given by the vector L = x × p, where the ‘times’ symbol
× denotes the vector product, which is a well-defined multiplication
rule for three-dimensional vectors. Whether you like it
or not, the Facebook-inspired ‘Like’ symbol on the left symbolizes
the ‘Like-rule’ that tells you in which direction the resulting
L vector is pointing. The instruction is also called the right-hand
or corkscrew rule, and its importance derives from the fact that
it unambiguously links a direction to a rotation.


We have illustrated the defining relation L = x × p in Figure
I.1.15. So L is a vector perpendicular to the surface
spanned by the vectors x and p. Whether it is pointing
up or down is determined by the right-hand rule, which in
modern parlance could be better termed the right ‘Like’ or
‘L’ rule: point your right index finger in the direction of the first
vector x, bend your fingers in the direction of the second
vector p, then the resulting vector L will point in the direction
of your thumb. This rule explains the meaning of the
vector or cross product, or × sign, for vectors. The length
of L is given by the product of the lengths of x and p times the
sine of the angle θ between them, implying that


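$$ |x \times p| = |x||p| \sin\theta =
\begin{cases}
0 & \text{if } x \text{ and } p \text{ are parallel} \\
|x||p| & \text{if } x \text{ and } p \text{ are perpendicular}
\end{cases} \tag{I.1.10} $$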


The vector product of two vectors is a vector that points
perpendicular to the plane defined by the two vectors,
and indeed the product had better be zero if the vectors are
pointing in the same direction, because then they do not
even define a plane.


In three dimensions we have two types of products for vectors:
the dot, inner, or scalar product, which maps a pair
of vectors into a number, a · b = |a||b| cos θ, and the cross,
exterior, or vector product, which maps a pair of vectors
into another vector. These definitions may at first sight
seem contrived, but the opposite is true: all this symbol
mumbo-jumbo is mostly there because it offers notational
convenience, efficiency and transparency.
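
To make the two products concrete, here is a minimal Python sketch. The vectors x and p are arbitrary illustrative values (they do not come from the text), and the helper names dot, cross and norm are my own. It checks that the length of L = x × p equals |x||p| sin θ, that L is perpendicular to both x and p, and that the product vanishes for parallel vectors.

```python
import math

def dot(a, b):
    """Scalar product a · b."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Vector product a × b for three-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Length of a vector."""
    return math.sqrt(dot(a, a))

# Illustrative values (assumptions, not from the text): position x and momentum p.
x = (1.0, 2.0, 0.0)
p = (3.0, -1.0, 0.0)

L = cross(x, p)                                   # angular momentum L = x × p
theta = math.acos(dot(x, p) / (norm(x) * norm(p)))  # angle between x and p
parallel = cross(x, (2.0, 4.0, 0.0))              # second vector is 2x, so parallel

print("L =", L)
print("|L| =", norm(L), "and |x||p| sin(theta) =", norm(x) * norm(p) * math.sin(theta))
print("L · x =", dot(L, x), "and L · p =", dot(L, p))
print("parallel case:", parallel)
```

Since x and p both lie in the horizontal plane here, L comes out along the vertical axis, which is what the right-hand rule predicts.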


This crash course of high school and first-year classical 
mechanics underscores once more that Newton laid the 
foundations of a general approach to dynamical systems 
irrespective of what they precisely describe. The variables 
could refer to either mechanics or to fluid- or electrody- 
namics, but for that matter they could equally well refer to 
ecology or economics. By creating the language and syn- 
tax of dynamical systems, Newton opened a monumental 
gateway into scientific thinking and modelling. Indeed, we 
are standing on the shoulders of giants.


Classical mechanics for aficionados


In this addendum we present two alternative ways in which
classical mechanics can be cast. The reason to do so is
that these formulations, though more abstract, are relevant
if we move into the quantum domain.


Canonical (Hamiltonian) structure. Let us first recast
the setting of classical mechanics in what is called
Hamiltonian form. It is just a matter of reformulating the
same physics in a slightly different but convenient mathematical
form. First we note that from the alternative form of
the equations (I.1.5-I.1.6), we learn that dU/dp = ∂H/∂p
and dV/dx = ∂H/∂x.⁴ Now we can write the equations of
motion (I.1.3) in their Hamiltonian form as


$$ \frac{dx}{dt} = \frac{\partial H}{\partial p} , \tag{I.1.11a} $$

$$ \frac{dp}{dt} = -\frac{\partial H}{\partial x} . \tag{I.1.11b} $$


This form of the equations is also called canonical and the 
x and p variables are called canonically conjugate. 
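
As a hedged illustration of how the canonical equations act as a first-order dynamical system, the sketch below integrates them numerically for the harmonic oscillator. The values m = k = 1, the starting point and the leapfrog stepping scheme are my own choices for the example, not something prescribed by the text. After one period the point in phase space should return to where it started, with the energy H unchanged up to small numerical errors.

```python
import math

m, k = 1.0, 1.0          # illustrative values (assumed for this example)

def H(x, p):
    """Total energy: kinetic part U(p) plus harmonic potential V(x)."""
    return p * p / (2.0 * m) + 0.5 * k * x * x

def dH_dx(x):
    """Partial derivative of H with respect to x, equal to dV/dx."""
    return k * x

def dH_dp(p):
    """Partial derivative of H with respect to p, equal to dU/dp."""
    return p / m

def leapfrog(x, p, dt, steps):
    """Integrate dx/dt = dH/dp and dp/dt = -dH/dx with a leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dx(x)   # half step in momentum
        x += dt * dH_dp(p)         # full step in position
        p -= 0.5 * dt * dH_dx(x)   # second half step in momentum
    return x, p

x0, p0 = 1.0, 0.0
period = 2.0 * math.pi * math.sqrt(m / k)   # one full oscillation
steps = 10_000
x1, p1 = leapfrog(x0, p0, period / steps, steps)

print("start:", (x0, p0))
print("after one period:", (x1, p1))
print("change in energy:", H(x1, p1) - H(x0, p0))
```

The leapfrog scheme is used here because it respects the phase space structure well; a simpler stepping rule would also work but tends to let the energy drift.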


Poisson structure. Having pushed the juggling with deriva- 
tives this far, it pays to go yet one step further and add 
one more element, which will present classical Hamilton- 
ian mechanics in yet another elegant form. This formu- 
lation in terms of Poisson brackets was much preferred 
by Paul Dirac as it brings the classical theory tantalizingly 
close to its quantum descendants. We should first note 
that for an arbitrary function on phase space f(x, p) we can 
derive its time evolution as a first-order dynamical system 
like: 


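$$ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial p}\frac{dp}{dt}
= \frac{\partial f}{\partial x}\frac{\partial H}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial H}{\partial x} , \tag{I.1.12} $$
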
where we used the equations (1.1.11). Next we may define 
the Poisson bracket of two arbitrary functions f(x, p) and 
g(x, p) by 

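$$ \{f, g\}_{pb} = \frac{\partial f}{\partial x}\frac{\partial g}{\partial p} - \frac{\partial g}{\partial x}\frac{\partial f}{\partial p} . \tag{I.1.13} $$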


It is an expression which is antisymmetric in f and g, as
{f, g}_pb = −{g, f}_pb. With this definition we can write the
time derivative of any function on phase space (i.e. any
dynamical variable) as the Poisson bracket with the Hamiltonian:

$$ \frac{df}{dt} = \{f, H\}_{pb} . \tag{I.1.14} $$

We say that the Hamiltonian ‘generates’ the time evolution
of the dynamical variables. For a conserved quantity Q we



⁴ We introduce the curly or partial derivatives, which mean that for a
function of several independent variables you only take the derivative
with respect to one of them (keeping the others fixed).


have by definition that dQ/dt = 0, which by the equation
above implies that {Q, H}_pb = 0. A trivial instance is the
case Q = H, where dH/dt = {H, H}_pb = 0, as it should.
In this way, we may also observe that the equations

$$ \frac{\partial f}{\partial x} = \{f, p\}_{pb} \quad \text{and} \quad \frac{\partial f}{\partial p} = -\{f, x\}_{pb} \tag{I.1.15} $$

hold as well. The first one states that the x derivative, i.e.
the effect of an infinitesimal translation in x-space on f,
is ‘generated’ by the momentum p. Finally I should also
point out the remarkable relation

$$ \{x, p\}_{pb} = 1 . \tag{I.1.16} $$

Variables which satisfy this relation are called canonically 
conjugate. These classical equations involving Poisson 
brackets have striking quantum lookalikes in the form of 
commutators as we will explain in the second Volume of 
the book. 
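
The bracket relations above can be checked numerically. The following is only a sketch: it approximates the partial derivatives in the definition of the Poisson bracket by central differences, and the values m = k = 1, the chosen phase space point and the function names are assumptions made for the example. For the harmonic oscillator Hamiltonian it confirms that {x, p} is 1, that {x, H} and {p, H} reproduce the right-hand sides of the canonical equations, and that {H, H} vanishes.

```python
m, k = 1.0, 1.0   # illustrative values (assumed for this example)

def poisson_bracket(f, g, x, p, h=1e-4):
    """{f, g} = df/dx * dg/dp - dg/dx * df/dp, using central differences."""
    df_dx = (f(x + h, p) - f(x - h, p)) / (2 * h)
    df_dp = (f(x, p + h) - f(x, p - h)) / (2 * h)
    dg_dx = (g(x + h, p) - g(x - h, p)) / (2 * h)
    dg_dp = (g(x, p + h) - g(x, p - h)) / (2 * h)
    return df_dx * dg_dp - dg_dx * df_dp

def X(x, p):
    """The coordinate as a function on phase space."""
    return x

def P(x, p):
    """The momentum as a function on phase space."""
    return p

def H(x, p):
    """Harmonic oscillator Hamiltonian."""
    return p * p / (2 * m) + 0.5 * k * x * x

x0, p0 = 0.3, -1.2   # an arbitrary point in phase space

xp = poisson_bracket(X, P, x0, p0)   # expected: 1
xH = poisson_bracket(X, H, x0, p0)   # expected: dx/dt = p/m
pH = poisson_bracket(P, H, x0, p0)   # expected: dp/dt = -k x
HH = poisson_bracket(H, H, x0, p0)   # expected: 0

print("{x, p} =", xp)
print("{x, H} =", xH, "compared with p/m =", p0 / m)
print("{p, H} =", pH, "compared with -k x =", -k * x0)
print("{H, H} =", HH)
```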


Lagrangian formulation of mechanics. There is another 
formulation of classical physics that is of great importance, 
particularly if one turns to relativistic systems. When we 
think of simple particle mechanics, the formulation uses 
the coordinate x(t) and the velocity v(t) = dx/dt as dy- 
namical variables. The central quantity now is not the en- 
ergy but rather the Lagrangian L(x, v) defined as: 
$$ L(x, v) = \frac{1}{2} m v^2 - V(x) , \tag{I.1.17} $$

where we have assumed that the time dependence is fully
contained in the position and velocity variables. Of particular
interest is the so-called action functional S[x(t)], corresponding
to the time integral of the Lagrangian:

$$ S[x(t)] = \int_{t_0}^{t_1} L(x(t), v(t)) \, dt . \tag{I.1.18} $$

The action is not just a function of x but a so-called functional
of the function x(t). You may think of the variable
as being the path taken by a particle that goes from a position
x_0 = x(t_0) to some other position x_1 = x(t_1). So if you
give me x(t) for all t then I can calculate v = dx/dt and
therefore also S, and that is why S is a functional of x(t).
The Newtonian force law of mechanics can now be de- 
rived from a variational argument with respect to the pos- 
sible paths. The variational principle says that the action is 
stationary under a small variation of the path. It is like say- 
ing that the extremum of a function corresponds to points 
where the derivative of that function is zero, indeed at a 
maximum or minimum of a function the slope of that func- 
tion is zero. For the functional the equivalent statement is 
to say that an extremum for the action of a particle to go 
from A to B along a path corresponds to paths for which 
the variation in the action vanishes. So, if we make a local
change of the path x′(t) = x(t) + δx(t), keeping the end points
fixed, then that will lead to a change in the action
S[x′(t)] = S[x(t)] + δS. The requirement that the variation
δS = 0 gives rise to the so-called Euler-Lagrange equation(s),
which read:


$$ \frac{d}{dt}\frac{\partial L}{\partial v} - \frac{\partial L}{\partial x} = 0 . \tag{I.1.19} $$


One easily verifies that for the particle Lagrangian (I.1.17)
one obtains Newton’s second law, m dv/dt + dV/dx = 0.
To go from the Lagrangian to the Hamiltonian formalism
involves the definition of the generalized or canonical momentum
p and the Hamiltonian H(p, x) as follows:


$$ p = \frac{\partial L}{\partial v} , \qquad H(p, x) = p v - L . \tag{I.1.20} $$
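
The statement that the action is stationary on the true path can be tested on a computer. The sketch below is an illustration under my own assumptions: a harmonic oscillator with m = k = 1, a time interval of length T = 1, a discretized version of the action, and a trial variation in the form of a sine-shaped bump that vanishes at both end points. For the classical path x(t) = sin t the first-order change of the action is (nearly) zero, whereas for a straight line between the same end points it is not.

```python
import math

m, k = 1.0, 1.0          # illustrative values (assumed for this example)
T, N = 1.0, 2000         # total time and number of time steps
dt = T / N

def action(path):
    """Discretized action: sum of (m v^2 / 2 - k x^2 / 2) dt along the path."""
    S = 0.0
    for i in range(N):
        v = (path[i + 1] - path[i]) / dt          # velocity on this segment
        x_mid = 0.5 * (path[i + 1] + path[i])     # position at the segment midpoint
        S += (0.5 * m * v * v - 0.5 * k * x_mid * x_mid) * dt
    return S

def first_variation(path, eps=1e-3):
    """Slope of S along a bump eps * sin(pi t / T) that vanishes at both ends."""
    bump = [math.sin(math.pi * i * dt / T) for i in range(N + 1)]
    plus = [x + eps * b for x, b in zip(path, bump)]
    minus = [x - eps * b for x, b in zip(path, bump)]
    return (action(plus) - action(minus)) / (2 * eps)

times = [i * dt for i in range(N + 1)]
classical = [math.sin(t) for t in times]            # obeys m dv/dt = -k x
straight = [t * math.sin(T) / T for t in times]     # same end points, wrong path

var_classical = first_variation(classical)
var_straight = first_variation(straight)

print("first variation on the classical path:", var_classical)
print("first variation on the straight line :", var_straight)
```

Only one particular variation is tried here; the variational principle demands that the change vanishes for every variation with fixed end points, which is what leads to the Euler-Lagrange equation.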


There are two reasons to introduce the action. One is that
for relativistic systems the action is a Lorentz or relativistic
invariant quantity while the energy or Hamiltonian is not,
and the second has to do with quantum mechanics. There
is a formulation of quantum theory, the so-called path inte- 
gral formalism in which the quantum probability amplitude 
to go from xo to x1 is given by a weighted sum (or integral) 
over all possible paths between the two points, and 


Finding the shortest path. If light or
a photon goes from point A to point
B, it presumably follows the shortest
path, and that path is a straight line between
the two points. However, if you use a navigator
in your car, it may ask you to specify whether
you mean the shortest route in a spatial sense (the 
cheapest) or the shortest route in time (the fastest). 
Knowing that ‘time is money’ this can be a tough 
decision to take. 
A kindergarten model for calculating the fastest
path is to show kids a chopstick and stick it into
a bowl filled with water. Hey! What’s happening?
It looks like the stick is broken! So you pull it out
again and ‘no’, it is not broken. You hear their brains
rattling. ‘If there is no water it’s not broken,’ says
one. ‘It breaks at the surface,’ says another. ‘I think
that the light ray is broken instead,’ says a girl in the
back. Bravo! That must be it!


The answer is given in the figure below. We have a
landscape with two countries; point A is situated in
the one with a maximum speed of c_1 = 120 km/hr
and point B is in the other where the maximum
speed is c_2 = 140 km/hr. Clearly the straight
line segment AB is the spatially shortest connection.
However, if we want the path that takes the
shortest time, we have to make a little calculation.
We have indicated that the car, after a distance s_1,
crosses the border at a point F, which is at position
x, after which it goes a distance s_2 in the other country.
So we choose as our action the time T it takes
to get from A to B. From the figure we see that

$$ s_1 = \sqrt{h_1^2 + x^2} \quad \text{and} \quad s_2 = \sqrt{h_2^2 + (d - x)^2} . $$


Then the calculation of T proceeds as follows: 


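[Figure: the path from A in Medium 1 (v = c_1) to B in Medium 2 (v = c_2), crossing the border at F.]

$$ T(x) = \int_A^B dt = \int_A^B \frac{dt}{ds} \, ds \tag{I.1.21} $$

$$ = \int_A^F \frac{1}{c_1} \, ds + \int_F^B \frac{1}{c_2} \, ds \tag{I.1.22} $$

$$ = \frac{s_1(x)}{c_1} + \frac{s_2(x)}{c_2} . \tag{I.1.23} $$
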
To find the minimum of T(x) we have to solve for the 
x-value where the derivative of T vanishes: 


$$ \frac{dT(x)}{dx} = \frac{1}{c_1}\frac{x}{s_1} - \frac{1}{c_2}\frac{d - x}{s_2} = 0 . \tag{I.1.24} $$


We observe that the two quotients correspond to 
the sines of the angles i and r respectively, so that 
the condition implies the simple identity: 

$$ \frac{\sin i}{c_1} = \frac{\sin r}{c_2} , \tag{I.1.25} $$

which is known as Snell's law for the refraction of 
a light ray at the interface of two media. And that 


brings us back to the deep connection between a 
broken chopstick and Google maps. After the only 
adult in the room had explained all this, the girl in 
the back still had a question: ‘How does the photon 
know which path to choose, as I presume it doesn’t
know how to take a derivative?’ 
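
For readers who would rather let a computer take the derivative, here is a minimal sketch of the same calculation. The speeds are the ones quoted above, but the geometry (the distances h1 and h2 of A and B from the border and their sideways separation d) consists of made-up values, since the figure gives no numbers. The code finds the crossing point where dT/dx vanishes by repeated halving of an interval and then checks that the two ratios in Snell's law agree.

```python
import math

# Hypothetical geometry (assumed values; the text gives no numbers for these).
h1, h2, d = 3.0, 2.0, 5.0
# Maximum speeds in the two countries, as quoted in the text (km/hr).
c1, c2 = 120.0, 140.0

def travel_time(x):
    """T(x) = s1/c1 + s2/c2 for a border crossing at position x."""
    return math.hypot(h1, x) / c1 + math.hypot(h2, d - x) / c2

def dT_dx(x):
    """Derivative of the travel time; it vanishes on the fastest route."""
    return x / (c1 * math.hypot(h1, x)) - (d - x) / (c2 * math.hypot(h2, d - x))

# dT/dx is negative at x = 0 and positive at x = d, so we can bisect.
lo, hi = 0.0, d
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if dT_dx(mid) < 0.0:
        lo = mid
    else:
        hi = mid
x_best = 0.5 * (lo + hi)

sin_i = x_best / math.hypot(h1, x_best)             # sine of the angle of incidence
sin_r = (d - x_best) / math.hypot(h2, d - x_best)   # sine of the angle of refraction
x_straight = d * h1 / (h1 + h2)                     # where the straight line AB crosses

print("fastest crossing point x =", x_best)
print("sin(i)/c1 =", sin_i / c1, "and sin(r)/c2 =", sin_r / c2)
print("time along straight line:", travel_time(x_straight))
print("time along fastest route:", travel_time(x_best))
```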


where the statistical weight depends on the action. In
quantum theory there are corrections to the classical
picture: contributions that correspond to
paths that are classically forbidden.


Maxwell’s electromagnetism 


It appears to me therefore, that the study of elec- 
tromagnetism in all its extent has now become of 
the first importance as a means of promoting the 
progress of science. 

James Clerk Maxwell, 1873 


The Maxwell equations give a unified description of elec- 
tricity, magnetism and electromagnetic waves such as light 
or radio waves. Electromagnetism introduced the powerful 
concepts of a field and of field dynamics. After we discuss 
some of the familiar electromagnetic phenomena in rela- 
tion to the Maxwell equations, we will introduce the gauge 
potentials which reveal two fundamental symmetries that 
turned out to underlie all of modern physics. The first is the 
so-called Lorentz invariance which lies at the root of spe- 
cial relativity, and the second refers to the notion of gauge 
invariance, a principle that underlies the description of all 
fundamental interactions. 


Besides gravity there are basic natural phenomena of an 

Figure I.1.16: Rainbows over Holland. Light and all its optical effects like rainbows are fully described by Maxwell’s equations. Note
on the right that in the barely visible secondary rainbow the sequence of colors is inverted. This is due to a second internal reflection of the light
ray in the water droplets (Photo: V. de Vries).


essentially different nature to be accounted for, those related
to electricity and magnetism. For these the universal
laws in their splendid generality were written down in
a treatise by James Clerk Maxwell in about 1865, almost two
centuries after Newton’s seminal contributions. His
four laws were universal as well, as they accounted for all
electric and magnetic phenomena observed to that date
and as a bonus turned out to also describe the propagation
of electromagnetic waves in their many guises such as
light, radio waves or X-rays. Maxwell created for us the
grand synthesis of many of the laws that were proposed 
earlier on by Coulomb, Ampère, Faraday, Lenz and many 
others. And a unified picture emerged of what once were 
considered entirely disconnected phenomena: electricity, 
magnetism and optics. 


Electromagnetic Fields. Maxwell’s theory is formulated 
in terms of a magnetic field B and electric field E, which 
depend on space and time. So at any instant in time at 
any point in space, the fields have a particular strength 
(E(x, t), B(x, t)). You may think of them as two little ar- 
rows (vectors) pointing in some directions in space. The 
Maxwell laws describe in detail how electric currents cause 
magnetic fields, and how changes in magnetic flux result in 
currents which counteract that change. The laws also de- 
scribe how accelerated charges emit electromagnetic ra- 
diation. From a more formal point of view they brought the 
fundamental but rather abstract concept of a field to life, 
in the sense that this concept was promoted from a mere 
mathematical abstraction and calculational tool to a physical
reality. Electromagnetic fields by themselves propagate
through space and time as waves and radiation, and
turned into physical entities carrying energy and momentum.
When you spend a day on the beach and forget your
sunscreen, you learn the hard way how much energy the 
electromagnetic waves emitted from the sun can carry. But 
also the beauty of a rainbow on a both sunny and foggy 
day is a manifestation of the electromagnetic interaction of 
light rays with the tiny vapor droplets in the fog. 


The Maxwell equations 


I am going to write down the Maxwell equations in their
full glory: in other words, in their gory detail. Not to scare
or impress you but because they are truly iconic. I think
you need to have seen them, otherwise it is like going to
Paris for the first time and missing out on the Eiffel Tower;
that would presumably make you mad at your tour operator.
The comments I will make are rather general and descriptive,
which hopefully makes showing them less daunting.
These are the four equations that could equally well
be called ‘the four Maxwell laws of electromagnetism and
light’. These equations are usually presented in the following
form:⁵


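$$ \nabla \cdot E = \rho , \tag{I.1.26a} $$

$$ \nabla \cdot B = 0 , \tag{I.1.26b} $$

$$ \nabla \times B = \frac{j}{c} + \frac{1}{c}\frac{\partial E}{\partial t} , \tag{I.1.26c} $$

$$ \nabla \times E = -\frac{1}{c}\frac{\partial B}{\partial t} . \tag{I.1.26d} $$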


We see that the equations, besides the E and B fields, 
depend on the charge and current densities ρ(x, t) and
j(x, t), and on the velocity of light c. That the charges and 
currents appear in these equations is no surprise as they 


⁵ The way they look depends on the precise choice of units; here I
work in Heaviside-Lorentz units because that choice makes them look
simpler. The physical parameter is the velocity of light c, and as we will
see it will pop up in most relations.


Figure 1.1.17: Coulomb's law. If we put a positive charge at 
rest at the origin, then the first Maxwell equation correspond- 
ing to Coulomb's law will yield an electric field pointing radially 
outward. The strength of the field (given by the length of the vector)
falls off as 1/r^2 in three dimensions. This equation by itself
describes what is called electro-statics. The second Maxwell 


equation tells you that the magnetic equivalent of such a radial 
field does not exist. 


are the sources of the fields. 

The first equation is often called Coulomb's or Gauss’ law, 
and it determines the electric field that is caused by a 
given charge distribution. It says in particular that a single 
charge causes a radial electric field around it, as illustrated 
in Figure 1.1.17. 

The second equation is the magnetic analogue of the first 
equation for isolated magnetic ‘North’ or ‘South’ charges. 
The right-hand side is put to zero, for the excellent reason 
that magnetic monopoles have never been observed, at 
least up until now. This is the ‘no monopole’ equation, but 
one sees that the system could be adapted to a situation 
where monopoles would show up, a situation that cannot 
be excluded a priori. 

Figure I.1.18: Ampère’s and Faraday’s laws. This figure illustrates
the two Maxwell equations involving the curl of the fields.
The left picture refers to Ampère’s law for the case of magnetostatics,
where a straight current yields an axially symmetric B
field. The picture on the right depicts Faraday’s or Lenz’s law
describing how a changing magnetic field or flux gives rise to
an electric field. If we think of the E loop as a closed conducting
loop, a current would start flowing so as to counteract the
change in the magnetic field.


The third equation, also called Ampère’s law, states that
magnetic fields are caused by a current (or moving charge) and by
a changing electric field. In other words, given the distribution
of charges and currents in space and time, the
Maxwell equations tell you exactly what the electromag- 
netic fields will look like. The third and fourth equation in- 
volve the so-called curl of a magnetic and electric field. In 
Figure I.1.18 we have indicated how the fields indeed ‘curl’
around the source, which is a vector like the current. It is
another instance of the ‘Like-rule’.

The fourth equation, also called Faraday’s or Lenz’s law, 
describes how a changing magnetic field causes (induces) 
an electric field, which in turn can give rise to a current. If
you take a conducting loop and you change the magnetic 
flux through that loop, then that change induces a current 
through the loop. If the loop is made of a superconduct- 
ing material, the current will keep running forever. Also 


in this equation we note that a potential ‘magnetic current 
term’ is manifestly absent for the same reason as before. 
It is this absence of magnetic monopoles and currents that 
breaks the would-be symmetry between electric and mag- 
netic phenomena. 

All the magnetic phenomena we have observed up to now 
are understood as caused by currents, meaning moving 
electric charges. Indeed, the second equation tells you 
that there are no magnetic purely radial monopole fields, 
while the third equation tells you that if you make a tiny 
closed current loop, it will act like a tiny magnetic dipole, 
and the overall configuration is a magnetic ‘dipolar’ field. 
You guessed it: all real magnets correspond to zillions of 
microscopic current loops, all neatly lined up. With the 
well-known consequence that if you break a bar magnet 
in half, you do not get a separated North and South pole, 
you just get two smaller dipolar bar magnets. 


Linearity. It is important to observe that the system of 
Maxwell equations is linear in the fields. This means that 
one can simply add different solutions. In other words, if I
have any set of solutions, then any linear combination of
these would again be a solution. This is illustrated in Figure
I.1.19. This linearity of the dynamical system basically
means that the electromagnetic field does not interact with 
itself. 


Electric-magnetic duality. We have emphasized that the 
asymmetry of the Maxwell equations reflects the asym- 
metry of nature with respect to the existence of electric 
charges and magnetic monopoles. Indeed if we restrict 
the equations to a source-free situation, meaning that ρ
and j are zero, then the equations exhibit a manifest symmetry
which is referred to as electric-magnetic duality. The
system of equations is in that case invariant under the
dual transformation or mapping, where we simultaneously
make the replacements E → B and B → −E. This mapping
transforms the first pair of equations into each other,
and likewise the second pair.

(a) Electric dipole field resulting from two opposite charges. The electric
field lines are red, and the equipotential lines are blue.
(b) Magnetic dipole field caused by a bar magnet. The field lines are made visible
by putting the magnet on a table and spreading some iron filings around it.


Figure I.1.19: Dipolar fields. If we put two opposite point charges at some distance from each other, the resulting field becomes dipolar,
meaning that the field lines start at the positive charge (magnetic north) and end on the negative charge. In (a) we have the electric
dipole field and in (b) we have the magnetic example. The second is approximated by an ordinary dipolar bar magnet. Because of the
linearity, the field configuration is obtained by just adding at every point the two Coulomb fields of the single charges as depicted
in Figure I.1.17.


Light as an electromagnetic wave. The most impressive 
and surprising achievement of Maxwell was the great dis- 
covery that even in the absence of sources, the equations 
allowed for solutions describing electromagnetic waves that 
propagate through empty space at the velocity of light. 
This explains why the only parameter that appears in these 
equations is the velocity of light. We will return to these 
electromagnetic waves shortly. 


It is gratifying to see how much ‘truth’ about physical real- 
ity can be described with so few symbols. You could say 
that the ultimate elegance of nature is most manifest once 
it is expressed in the powerful language of mathematics. 
Awesome indeed! 


Partial differential equations. The equations form a system
of partial differential equations, partial because the fields
depend on space and time variables, and the derivatives
that appear are with respect to the spatial coordinates as
well as time. This explains also the appearance of the ‘del’
or ‘nabla’ operator ∇, which is just the ‘vector of spatial
derivatives’, so

$$ \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) . \tag{I.1.27} $$

To systematically solve equations involving the vector operator
∇, mathematicians have developed a special subject
called vector calculus. That is what physics students
have to study and are supposed to master, and as such, it
is far beyond the scope of this book. You will believe me if
I say that many shelves in our university libraries are full of
books and journals that are stuffed with explicit solutions
of the Maxwell equations for virtually any imaginable situation.
With all due respect, we will stay far from those impressive
halls of wisdom, though we discuss some fundamental
theorems involving the nabla operator ∇ in a Math
Excursion at the end of Part III on page 621. My narrative
only tries to convey the overall structural aspects of
the theory, which by the way does not force my story to
become superficial, in fact quite the contrary.


A dynamical systems perspective. We may elevate the
dynamical systems perspective of the previous section on
mechanics to the Maxwell equations and say that the dynamical
‘variables’ are now the components of the E and B
fields, which satisfy certain dynamical equations or equations
of motion,

$$ \frac{dB}{dt} = f_B(E, B) , \tag{I.1.28a} $$

$$ \frac{dE}{dt} = f_E(E, B) . \tag{I.1.28b} $$

Locality. These are indeed only two of the four Maxwell 
equations, those with time derivatives in them. Note that 
on the right-hand side I have for convenience suppressed
the dependence on the spatial derivatives of the fields, because
at a given time t these can be calculated from the
fields themselves at time t. The main point here is that the
equations are local: loosely speaking one could consider 
the fields as an infinite collection of independent variables 
which are only locally coupled. 
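
To see what ‘locally coupled variables’ means in practice, consider the following sketch. It is a deliberately stripped-down setting of my own choosing, not something taken from the text: no charges or currents, fields that depend on one coordinate x only, E pointing along y and B along z, a periodic line of N grid points, and a time step chosen such that c dt equals the grid spacing. The two time-derivative Maxwell equations then reduce to dB/dt = −c dE/dx and dE/dt = −c dB/dx, and each field value is updated using only its nearest neighbours.

```python
import math

N = 200       # number of grid points on a periodic line (assumed)
steps = 50    # number of time steps; with c*dt = dx a pulse moves one cell per step

def gaussian(i):
    """A smooth pulse centred in the middle of the grid."""
    return math.exp(-((i - N // 2) ** 2) / (2.0 * 10.0 ** 2))

# E sits on grid points i; B sits half a cell to the right and half a step earlier.
# Choosing B equal to E (suitably shifted) gives a pulse that moves to the right.
E = [gaussian(i) for i in range(N)]
B = [gaussian((i + 1) % N) for i in range(N)]
E_start = E[:]

for _ in range(steps):
    # dB/dt = -c dE/dx: each B value only needs the two neighbouring E values
    B = [B[i] - (E[(i + 1) % N] - E[i]) for i in range(N)]
    # dE/dt = -c dB/dx: each E value only needs the two neighbouring B values
    E = [E[i] - (B[i] - B[i - 1]) for i in range(N)]

peak_start = max(range(N), key=lambda i: E_start[i])
peak_end = max(range(N), key=lambda i: E[i])
print("the pulse peak moved from cell", peak_start, "to cell", peak_end)
```

With this special time step the pulse is shifted by exactly one cell per step without changing shape; for other choices of the step the scheme is only approximate.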


Constraints. The other pair of equations without time deriva- 
tives are constraint equations; in order for the system to 
be consistent, these have to be obeyed at all times. So if 
these equations are satisfied at some initial time t = 0, 
then consistency of the system requires that they remain 
valid for all t, and this requires that the time derivatives of 
those equations should vanish. 


This, in turn, can be proven from the Maxwell equations.
For the second equation the argument is quite straightforward:
one finds that by taking the time derivative of that
equation one obtains, up to a constant factor, the same expression
as by taking the divergence of the right-hand side of the fourth
equation. The latter, in turn, equals ∇ · (∇ × E), which vanishes
identically, meaning that it is zero for any field E.
This is discussed in the Math Excursion on vector calculus
on page 621 of Part III. For the first and third Maxwell
equations a similar argument can be applied: comparing
the time derivative of the first and the divergence of the
third equation, we see that consistency requires the following
relation to hold:

$$ \frac{\partial \rho}{\partial t} + \nabla \cdot j = 0 . \tag{I.1.29} $$


This equation is the continuity equation for electric charge, 
it relates the time derivative of the charge in a given vol- 
ume with the current through the surface bounding that 
volume. In other words, it is the local conservation law 
for electric charge. The conclusion is that the consistency 
of the Maxwell equations requires local charge conserva- 
tion. 
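
For completeness, the step from the first and third Maxwell equations to the continuity equation can be written out as follows. This is a sketch that assumes the Heaviside-Lorentz form of the equations used in this section and the identity that the divergence of a curl vanishes.

```latex
\begin{align*}
\frac{\partial}{\partial t}\left(\nabla \cdot E\right) &= \frac{\partial \rho}{\partial t}
  && \text{(time derivative of the first equation)} \\
0 = \nabla \cdot \left(\nabla \times B\right) &= \frac{1}{c}\,\nabla \cdot j
  + \frac{1}{c}\,\frac{\partial}{\partial t}\left(\nabla \cdot E\right)
  && \text{(divergence of the third equation)} \\
\Longrightarrow \quad \frac{\partial \rho}{\partial t} + \nabla \cdot j &= 0
  && \text{(substituting the first line into the second)}
\end{align*}
```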


Constraint equations can be used to reduce the number of 
independent degrees of freedom, fields in this case. What 
that means is that electromagnetism does not really have 
two times three equals six independent field components 
as the two equations above suggest. Maxwell’s second
and fourth equations express two local, (x, t)-dependent
‘constraints’, which reduce the number of independent field
variables from six to four. And these correspond to the four
gauge potentials we will get to shortly. Nevertheless, from 
this dynamical systems point of view there is a remarkable 
structural similarity between the mechanical and electro- 
magnetic systems. 


The electromagnetic force exerted on a charge. The 
Maxwell equations feature external sources in terms of 
charges and currents. Clearly these refer to charged par- 
ticles or collectives thereof. So to complete the dynami- 
cal system approach we should also include the dynamics 
of the charges and currents. This in turn means that we 
specify the forces that these are subject to in given elec- 
tric and magnetic fields. The expression for this so-called 

Figure I.1.20: Motion of charge in an electromagnetic field.
This figure illustrates how the Lorentz force works on a charged 
particle. We show that the force has two contributions: one pro- 
portional to and in the direction of the electric field and one pro- 
portional to the magnetic field and the velocity in a direction per- 
pendicular to the field and the velocity. 


Lorentz force exerted on a charge at a point (x,t) by the 
fields E(x, t) and B(x, t) is the following: 


$$ F = q \left( E + \frac{v}{c} \times B \right) . \tag{I.1.30} $$


The first term is a force in the direction of the electric field 
that any charge will feel, and the second term is the mag- 
netic, so-called Lorentz force, which is orthogonal to the 
velocity of the charged particle. It is proportional to the 
magnitude of the current j = qv and clearly vanishes when 
a particle is at rest. The fact that the magnetic component 
of the force is perpendicular to the velocity means that that 
component is always perpendicular to the trajectory, and 
consequently implies that the magnetic field cannot do any 
work on the charge. A charge in a constant magnetic field 
perpendicular to its velocity would therefore move in a cir- 
cular orbit as we depicted in Figure 1.1.21. 
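
The claim that a magnetic field does no work and bends the orbit into a circle can be illustrated numerically. The sketch below makes several assumptions of its own: units with q = m = c = 1, a constant field B along the z-axis, no electric field, and a standard velocity-rotation stepping rule (often called the Boris scheme) rather than anything specific to this book. It tracks the speed of the particle and its distance from the expected centre of the orbit, whose radius should be m c |v| / (q |B|).

```python
import math

q, m, c = 1.0, 1.0, 1.0        # illustrative units (assumed)
B = (0.0, 0.0, 1.0)            # constant magnetic field along z
dt, steps = 1e-3, 10_000

def cross(a, b):
    """Vector product a × b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Length of a vector."""
    return math.sqrt(sum(ai * ai for ai in a))

def rotate_velocity(v):
    """Advance v by dt under m dv/dt = (q/c) v × B; this is a pure rotation."""
    t = tuple(q * Bi * dt / (2.0 * m * c) for Bi in B)
    s = tuple(2.0 * ti / (1.0 + sum(tj * tj for tj in t)) for ti in t)
    v_prime = tuple(vi + wi for vi, wi in zip(v, cross(v, t)))
    return tuple(vi + wi for vi, wi in zip(v, cross(v_prime, s)))

x, v = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
speed0 = norm(v)
radius = m * c * speed0 / (q * norm(B))   # expected orbit radius
centre = (0.0, -radius, 0.0)              # for q > 0, B along +z and v along +x

max_speed_error = 0.0
max_radius_error = 0.0
for _ in range(steps):
    v = rotate_velocity(v)
    x = tuple(xi + vi * dt for xi, vi in zip(x, v))
    max_speed_error = max(max_speed_error, abs(norm(v) - speed0))
    dist = norm(tuple(xi - ci for xi, ci in zip(x, centre)))
    max_radius_error = max(max_radius_error, abs(dist - radius))

print("largest change in speed        :", max_speed_error)
print("largest deviation from radius  :", max_radius_error)
```

The speed stays constant because each step only rotates the velocity vector, which mirrors the statement that the magnetic part of the force is always perpendicular to the velocity.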


Clearly, the dynamical system to be solved is the cou- 
pled system of Newton’s and Maxwell's equations where 


Figure 1.1.21: Motion of charge in a constant magnetic field. 
This figure shows the orbit of a charged particle with a velocity 
perpendicular to the field. The force is constant and perpendic- 
ular to the velocity and will cause the particle to have a circu- 
lar orbit. As the force is always perpendicular to the orbit the 
magnetic field does not do any work, and the magnitude of the 
velocity remains constant. 


Newton’s equations have to include the Lorentz force and 
the charge(s) and their currents have to be included as 
sources in the Maxwell equations. This system is of course 
non-linear because of the feedback caused by the interac- 
tion terms. 


We will later show how the electromagnetic interaction af- 
fects the energy function or the Hamiltonian of a charged 
particle, but that is more conveniently expressed in terms 
of the gauge potentials that we will introduce shortly. 


Field energy and momentum. If we put a charged parti- 
cle in a constant electric field, the field will exert a constant 
force on the particle which will therefore start to acceler- 
ate uniformly. This in turn means that its energy will in- 
crease. Now if we want to maintain the sacred principle 
of overall energy conservation, then one is forced to as- 
sume that the electromagnetic field also carries energy. 
Indeed, the mere fact that the Maxwell equations without
any charges and currents describe propagating waves
means that the fields should carry both energy and mo- 
mentum. Furthermore, once properly defined, it turns out 
that both the total energy and momentum for the whole 
system including charges and currents and fields is con- 
served again, assuming of course that the fields evolve 
according to Maxwell’s equations. 


Because the electric and magnetic fields, as fundamental
variables, depend on space and time (we say that they
describe local degrees of freedom), it is natural to define
field energy and field momentum densities. This means
that in order to get the total energy or momentum within a
given volume one has to integrate the densities over that
volume.

The expression for the energy of the electromagnetic field
is basically the sum of (or better, the integral over) the
contributions at all points in space of a field energy density

\[ \varepsilon(\mathbf{x},t) = \tfrac{1}{2}\left(\mathbf{E}^2 + \mathbf{B}^2\right) , \]

which is quadratic in E and in B, where you may think
of the first term as corresponding to the ‘kinetic energy’
and the second to the ‘potential energy’. This total energy
is conserved. The fields also carry a momentum density S/c²,
where S(x,t) = c(E × B) is called the Poynting vector (it is
the energy flux density), and an angular momentum density
L(x) = (x × S)/c², in complete analogy with particle angular
momentum L = x × p. This comes out most clearly in the
electromagnetic wave solutions to the Maxwell equations
illustrated in Figure I.1.23, which shows that the fields form
propagating waves that are transversal, meaning that at any
point in space the vectors E and B are mutually perpendicular,
and also perpendicular to the direction of propagation. From
the figure one verifies that the field momentum density, which
points along S, is, as expected, directed along the propagation
direction of the wave.


Three fundamental principles. The remainder of this
section is devoted to two fundamental symmetry principles
underlying the Maxwell equations of electromagnetism.
The first principle refers to the notion of Lorentz invariance,
which forms a key link with the theory of relativity.

The second principle refers to the notion of gauge invariance,
which amounts to a hidden redundancy that is present
if we describe electromagnetism in terms of E and B fields
as we usually do.

The third principle concerns the quantum nature of
electromagnetism, of which the most basic manifestation is
that we have to think of electromagnetic fields in terms of
particle-like excitations or quanta, called photons. This third
principle is the main subject of the book and will be fully
explored in the forthcoming chapters; we will not discuss it
any further here.


The Maxwell equations refer to the fields E and B, because 
these fields are the physical fields we can measure quite 
directly. The equations are beautiful, but that beauty has 
its price in the sense that the description is highly redun- 
dant and therefore basically inefficient! The reason we al- 
ready touch on these rather sophisticated symmetry prin- 
ciples here is that in hindsight it turns out that these two 
invariances, combined with the principles of quantum the- 
ory, really form the conceptual backbone of all of modern 
fundamental physics. The tremendously successful Stan- 
dard Model of fundamental forces and particles is a partic- 
ular expression of these three underlying principles. More- 
over, understanding these principles played an essential 
guiding role in discovering the Standard Model. 


Electromagnetic waves 


The source-free Maxwell equations can be recast in the 
form of wave equations. The wave equations manifestly 
display the underlying Lorentz or relativistic invariance of 
the Maxwell theory. In that sense Maxwell theory was the 
cradle of relativity. 
Relativistic wave equations.
By mathematically manipulating them we can cast the Maxwell
equations (I.1.28) in an alternative form. In the case
of vanishing sources, in other words with zero charges and
currents, they take the form of two wave equations:
one for the electric and one for the magnetic field.⁶


These wave equations are Lorentz and therefore relativisti- 
cally invariant, which means, as we will discuss later in the 
corresponding section on page 60, that they will take the 
same form for different observers that move at a constant 
speed with respect to one another. Such observers have 
coordinate frames that are different, but the statement is 
that the frames of two such observers are related by a so- 
called Lorentz transformation, which depends on their rel- 
ative velocity. An alternative way to express the fact that 
the equations ‘look the same’ for the different observers is 
to say that the equations are invariant under Lorentz trans- 
formations. 


⁶ A typical ‘wave equation’ is discussed in the Math Excursion at the
end of Volume III on page 613.


Figure I.1.22: Aurora Borealis. The Northern Lights are caused
by collisions of charged particles coming from the sun and gas
particles from the earth’s atmosphere. The most common
auroral color, a pale yellowish-green, is produced by oxygen
molecules located about 60 miles above the earth. Rare, all-red
auroras are produced by high-altitude oxygen, at heights of up
to 200 miles. (Source: Wikimedia)


Four-vectors. Let us look at this a little closer. In ordinary
space we can define a coordinate vector x, and then we
know that a rotation will change the direction it is pointing.
What does not change is the dot product or the squared length
of the vector, $\mathbf{x}\cdot\mathbf{x} = |\mathbf{x}|^2$. The length of any vector is
invariant under rotations, and this also holds therefore for the
square of the vector operator $\nabla$. To explain the notions
of Lorentz invariance and of space-time we do something
similar. First we define a space-time coordinate four-vector
$x^\mu = (x^0, \mathbf{x})$ with $x^0 = ct$; the factor c is there to also
give $x^0$ the dimension of a length. Next we define the
relativistic ‘length’ or space-time interval s of that coordinate
vector by the relation $s^2 = x^\mu x_\mu = (x^0)^2 - \mathbf{x}\cdot\mathbf{x}$, where
the repeated upper and lower index μ by definition
means that we have to sum over its range 0, ..., 3, with the
minus sign for the spatial components included. The notion
of Lorentz invariance refers now to the fact that the
space-time interval is invariant under Lorentz transformations,
just like the length of an ordinary vector is invariant
under rotations. So Lorentz transformations are the
generalization of ordinary rotations in three-dimensional
Euclidean space to four-dimensional space-time (also called
Minkowski space).
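
The invariance of the space-time interval can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: it uses the standard Lorentz boost along the x-axis with velocity βc (the explicit form of the transformation is not given in this section), and the function names `boost_x` and `interval` are hypothetical.

```python
import math

def boost_x(ct, x, y, z, beta):
    """Standard Lorentz boost along x with velocity beta*c (|beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

def interval(ct, x, y, z):
    """Space-time interval s^2 = (x^0)^2 - x.x with x^0 = ct."""
    return ct * ct - (x * x + y * y + z * z)

event = (5.0, 1.0, 2.0, 3.0)           # (ct, x, y, z) in some length unit
boosted = boost_x(*event, beta=0.6)

print(event, interval(*event))
print(boosted, interval(*boosted))
```

The individual components change under the boost, but the combination s² is the same for both observers, just as the length of an ordinary vector is unchanged by a rotation.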


The box-operator. The wave equations feature second-order
spatial and time derivatives in a unique relativistically
invariant combination denoted by

\[ \Box \equiv \partial^\mu \partial_\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 . \tag{I.1.31} \]

The electromagnetic wave equations can then simply be
written as

\[ \Box\, \mathbf{E} = 0 , \tag{I.1.32a} \]
\[ \Box\, \mathbf{B} = 0 . \tag{I.1.32b} \]


Figure 1.1.23: Electromagnetic wave. This is a propagating 
wave of periodic electric and magnetic fields. The polarizations 
of the electric and magnetic field are orthogonal, and both are 
orthogonal to the direction of propagation which is along the di- 
rection of the field momentum S. 


In the ‘box’ operator □ we see that time and space appear
on an equal footing, which amounts to saying that
this operator is relativistically invariant. The ‘box’ operator
is the relativistic wave operator, and the equations above
are the equations for electromagnetic waves. And indeed,
it was this property of invariance of the Maxwell equations
under the Lorentz transformations, named after their
discoverer, the Dutch physicist and early Nobel laureate Hendrik
Antoon Lorentz, which was a crucial key used by Einstein
to unlock the gateway to the world of relativity.


Basic properties of waves. Like all waves, electromagnetic
waves are characterized by a wavelength λ, a frequency ν,
and a velocity v, which in this case of course
equals the speed of light, |v| = c. These three quantities
are not independent, since they satisfy the relation ν = c/λ.
So electromagnetic waves are special in that, in vacuum, they
always travel with the speed of light; you can’t speed them
up or slow them down. If you put more energy into the
waves, two things may happen: (i) the amplitudes of
components may go up (the signal becomes more intense),
and/or (ii) the frequency may increase, meaning that the
colour (in the case of light) will be shifted towards the blue.
In the quantum world where we think of photons or
particles of light, the corresponding mechanisms are (i) that
we can create more particles of light, or (ii) that we can give
the particles themselves more energy by increasing the
frequency.


Figure I.1.24: Electromagnetic radiation spectrum. Classical
electromagnetic waves can have any wavelength, from very
long wavelength radio waves to the ultra short wavelength hard
gamma rays. Visible light represents a narrow range in the
center.


We have depicted the characteristic spatial structure of a
classical electromagnetic wave in Figure I.1.23, and one
sees that for such waves the directions (or polarizations)
of the electric and magnetic field amplitudes are orthogonal
to each other and to the propagation direction as well.
The discovery of these wavelike solutions was a seminal
contribution to electromagnetic theory, because it unified
electromagnetism with the field of optics. The waves can
in principle have any frequency or wavelength. We have
sketched the spectrum of electromagnetic radiation in
Figure I.1.24, from which we see that the spectrum of visible
light only covers a narrow range in the center. On the
long-wavelength side the spectrum continues via the infrared
into the microwaves and radio waves. On the short-wavelength
side it continues via the ultraviolet and X-rays into hard gamma
rays. This side of the spectrum corresponds to ionizing
radiation, where ionizing means that the electrons in the
outer shells of atoms and molecules will be kicked out so
that positively charged ions stay behind. This among other
things means that this radiation is very damaging to
biological tissue and one should avoid being exposed to it.
In other words, avoid spending the weekend on a tropical
beach without sunscreen.
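
To get a feeling for the range of the spectrum, one can convert a few wavelengths into frequencies with the relation ν = c/λ. The sketch below also quotes the energy per photon using the Planck relation E = hν, which is not stated in this section and is brought in here as an assumption; the sample wavelengths and the function names are illustrative choices.

```python
C = 299792458.0          # speed of light in m/s (exact SI value)
H = 6.62607015e-34       # Planck constant in J s (exact SI value)
EV = 1.602176634e-19     # one electronvolt in joules (exact SI value)

def frequency(wavelength_m):
    """Frequency in Hz from the wavelength in metres: nu = c / lambda."""
    return C / wavelength_m

def photon_energy_ev(wavelength_m):
    """Energy of one photon in eV, assuming the Planck relation E = h * nu."""
    return H * frequency(wavelength_m) / EV

samples = [("radio wave", 3.0), ("green light", 550e-9), ("hard X-ray", 1e-11)]
for name, lam in samples:
    print(f"{name:12s} lambda = {lam:.3e} m  nu = {frequency(lam):.3e} Hz  "
          f"E = {photon_energy_ev(lam):.3e} eV")
```

Shorter wavelengths mean higher frequencies and more energy per photon, which is why the short-wavelength end of the spectrum is the ionizing one.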


Lorentz invariance: the key to relativity


We introduce the electromagnetic potentials, and rewrit- 
ing the electromagnetic fields in terms of these reduces 
the number of independent equations to four. In this form 
the invariance of the system under Lorentz transformations 
becomes manifest, establishing that the system is fully rel- 
ativistic. This and the following section basically show a 
form in which the Maxwell equations can be cast that maxi- 
mally exhibits their fundamental structure and beauty. 


Gauge potentials. It is interesting that in the context of
quantum theory it is far more profitable to use a different
parametrization of the electromagnetic field in terms of
so-called gauge potentials denoted by $A_\mu(\mathbf{x},t)$. As before,
the index μ runs from 0, ..., 3, with 0 the time component
and 1, 2, 3 the space components.


The components of the four-vector $A_\mu = (V, -\mathbf{A})$ are the
electromagnetic potentials, where V is often referred to as the
electrostatic or scalar potential and A as the vector potential.
From these potentials the electric and magnetic field can be
calculated directly through the defining relationships:

\[ \mathbf{B} = \nabla \times \mathbf{A} , \tag{I.1.33a} \]
\[ \mathbf{E} = -\nabla V - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} . \tag{I.1.33b} \]


Let us indicate how these expressions come about. One
may show that for any magnetic field configuration B with
zero divergence, meaning that it satisfies equation (I.1.26b),
there is a vector field A that satisfies equation (I.1.33a).
In fact that A is not unique, as we’ll see later. Indeed
one finds that the equality $\nabla\cdot(\nabla\times\mathbf{A}) = 0$ holds for
any A; it is a mathematical identity which basically follows
from the definition of the vector derivative $\nabla$. If we
proceed by substituting this expression of B into the equation
(I.1.26d), we get an equation of the type $\nabla\times\mathbf{C} = 0$, with
$\mathbf{C} = \mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$. Now there is another identity that says that
any field C whose rotation vanishes can be written as a
gradient of some scalar field V. This means that we may
write $\mathbf{C} = -\nabla V$, from which the equation (I.1.33b) then
follows. So by changing from the E and B fields to the
potential $A_\mu = (V, -\mathbf{A})$ we have identically satisfied two
of the four Maxwell equations. From the other two follow
equations that the gauge potentials have to satisfy.


The electromagnetic field strength. You might wonder
why I, clearly being in love with relativity, don’t come
up with four-vectors $E_\mu$ and $B_\mu$. Alas, ‘It ain’t necessarily
so...’. Better even, ‘it just ain’t gonna work!’ The appropriate
relativistic place for the electric and magnetic fields
is that they correspond to the components of an antisymmetric
two-index object (a tensor) called the field strength $F_{\mu\nu}$:

\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{I.1.34} \]

The three spatial components $F_{ij}$ correspond with the
components of B, and the space-time components $F_{0i}$
correspond with the components of E. The μ ↔ ν antisymmetry
can be visualized more conveniently by writing F as an
antisymmetric 4 × 4 matrix:

\[ F_{\mu\nu} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix} . \tag{I.1.35} \]


It clearly shows how the components of the E and B fields
are not part of four-vectors, which means that the E and B
components may mix if we make a Lorentz transformation
from one reference frame to another, just like the space
and time components of the position four-vector do. This
mixing is not entirely unexpected. A particle at rest in one
frame is a moving particle in another frame. The static
particle has a purely radial electric field; the moving charge,
however, is like a current and generates a magnetic field as
well. So one expects that under a Lorentz transformation the
E and B fields should mix. And if each of them were a
four-vector, transformations would not mix the two sets of
components.
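
The mixing of E and B can be illustrated numerically. The sketch below is not taken from the book: it assumes the standard transformation law for the fields under a boost with velocity βc, in units where E and B have the same dimension (E′ = γ(E + β × B) and B′ = γ(B − β × E) for the components perpendicular to β, with the parallel components unchanged). The function name `boost_fields` is hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def boost_fields(E, B, beta):
    """Fields seen from a frame moving with velocity beta*c (assumed standard law)."""
    b2 = dot(beta, beta)
    if b2 == 0.0:
        return E, B
    gamma = 1.0 / math.sqrt(1.0 - b2)
    n = tuple(bi / math.sqrt(b2) for bi in beta)      # unit vector along beta
    bxB, bxE = cross(beta, B), cross(beta, E)
    E_par, B_par = dot(E, n), dot(B, n)
    # gamma * (...) rescales everything; subtracting (gamma - 1) * parallel part
    # restores the components along beta to their original values.
    E_new = tuple(gamma * (E[i] + bxB[i]) - (gamma - 1.0) * E_par * n[i]
                  for i in range(3))
    B_new = tuple(gamma * (B[i] - bxE[i]) - (gamma - 1.0) * B_par * n[i]
                  for i in range(3))
    return E_new, B_new

# A purely electric field in the rest frame of a static charge...
E, B = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0)
E_new, B_new = boost_fields(E, B, beta=(0.6, 0.0, 0.0))
print(E_new, B_new)   # ...acquires a magnetic component for the moving observer

# Two combinations that do not change under the boost
inv1 = dot(E, E) - dot(B, B), dot(E_new, E_new) - dot(B_new, B_new)
inv2 = dot(E, B), dot(E_new, B_new)
print(inv1, inv2)
```

The observer who sees the charge moving finds a magnetic field where the rest-frame observer finds none, while E² − B² and E · B come out the same in both frames.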


From the manifestly relativistic definitions above, we see
that the symmetry between electric and magnetic fields is
particularly special to four-dimensional space-time. If we
consider what the matrix $F_{\mu\nu}$ would look like in different
dimensions, this becomes very clear: (i) in two-dimensional
space-time there is only a single electric field component
along the space direction and there is no magnetic field; (ii)
in three-dimensional space-time we have an electric vector
field with two components and a single-component magnetic
field, which is therefore like a (pseudo) scalar.


We can now also write the Maxwell equations in manifestly
relativistic form. The equations with sources (I.1.26a) and
(I.1.26c) will then read:

\[ \partial^\mu F_{\mu\nu} = \frac{1}{c}\, j_\nu , \tag{I.1.36} \]

where a repeated upper and lower index implies a summation
over that index from 0, ..., 3. On the right-hand side
we have the current $j_\nu$, which is now also a four-vector. Its
time component is equal to the charge density ρ times
the velocity of light c, and its spatial components are
the components of the usual electric current-density vector j:

\[ j^\mu = (c\rho, \mathbf{j}) , \tag{I.1.37} \]

so that with a lowered index $j_\mu = (c\rho, -\mathbf{j})$, in line with
$A_\mu = (V, -\mathbf{A})$. The other two, sourceless, Maxwell equations
can also be written in a manifestly Lorentz invariant way as

\[ \partial^\mu \tilde{F}_{\mu\nu} = 0 , \tag{I.1.38} \]

where we have constructed the dual field strength $\tilde{F}_{\mu\nu}$,
marked with a ‘tilde’, by applying the electric-magnetic
duality transformation discussed on page 22 to $F_{\mu\nu}$, yielding

\[ \tilde{F}_{\mu\nu} = \begin{pmatrix} 0 & B_1 & B_2 & B_3 \\ -B_1 & 0 & E_3 & -E_2 \\ -B_2 & -E_3 & 0 & E_1 \\ -B_3 & E_2 & -E_1 & 0 \end{pmatrix} . \tag{I.1.39} \]

Again these sourceless equations are solved identically by
substituting the field strength in terms of the gauge
potentials. In other words, by substituting the expressions
(I.1.33) of E and B in terms of the gauge potentials into the
equation (I.1.38).


The action for the Maxwell field. We have, in the closing
subsection about classical mechanics, highlighted the
importance of the concept of an action (and Lagrangian) for
relativistic systems. As the Maxwell system is a relativistic
system with the fields and their derivatives as fundamental
degrees of freedom, we should ask whether there is a
suitable form of the Lagrange formalism in this case. The
answer is affirmative, so let us show what it looks like. First
of all let us introduce the Lagrangian density, which
corresponds to the Lorentz invariant expression that is quadratic
in the derivatives of the field:

\[ \mathcal{L}(A_\mu, \partial_\nu A_\mu) = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{1}{c}\, j_\mu A^\mu . \tag{I.1.40} \]

The Lagrangian L would be given by the integration over
space of the density $\mathcal{L}$, and the action S is obtained by
an additional integration over time. This yields the fully
covariant expression,

\[ S[A_\mu] = \int \mathcal{L}(A_\mu, \partial_\nu A_\mu)\, d^4x . \tag{I.1.41} \]

One may show that the Maxwell equations (I.1.36) correspond
to the Euler-Lagrange equations for this action.


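Current conservation. The previous equations require
that the current j is conserved, which means to say that

\[ \partial^\mu j_\mu = 0 . \tag{I.1.42} \]

This follows by acting with $\partial^\nu$ on equation (I.1.36): the
left-hand side then vanishes identically, because $F_{\mu\nu}$ is
antisymmetric while $\partial^\nu\partial^\mu$ is symmetric. The substitution
of the definitions yields the continuity equation, which
expresses the local conservation law for electric charge,

\[ \frac{1}{c}\frac{\partial j_0}{\partial t} + \nabla\cdot\mathbf{j} = \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 . \tag{I.1.43} \]

Integrating this equation over some volume V, it states
that the increase of the charge in V per unit time equals the
net electric current flowing inward through the closed
surface that bounds that volume.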


A way to think about this is to consider an office building 
where people go in and out. Then if we state that the num- 
ber of people in the building is locally conserved, it means 
that the total number of people in the building is equal to 
the number that are already in there, plus or minus the 
people who enter or leave the building. It is local because 
you can apply it to any volume, for example the law also 
applies to any floor of the building, or any individual room 
for that matter. 
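
The bookkeeping behind local conservation can be sketched in a few lines. The example below is an illustrative assumption, not the book’s construction: space is reduced to a row of cells (the ‘rooms’), `charge[i]` is the charge in cell i, and `flux[i]` is the current through the left face of cell i, so that `flux` has one more entry than `charge`. The numbers are made up.

```python
def step(charge, flux, dt):
    """One time step of the discrete continuity equation.

    Each cell gains what flows in through its left face and loses what
    flows out through its right face.
    """
    return [q + dt * (flux[i] - flux[i + 1]) for i, q in enumerate(charge)]

charge = [1.0, 2.0, 3.0, 4.0]
flux = [0.0, 0.5, -0.2, 0.3, 0.0]   # no current through the outer walls
dt = 0.1

new_charge = step(charge, flux, dt)
print(new_charge, sum(new_charge))

# The law is local: it also holds for any sub-block, here cells 1 and 2.
block_before = charge[1] + charge[2]
block_after = new_charge[1] + new_charge[2]
inflow = dt * (flux[1] - flux[3])   # net current through the block's two faces
print(block_after - block_before, inflow)
```

The total charge is unchanged because nothing crosses the outer walls, and the change inside any block of cells equals the net current through the faces that bound it.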


The energy of a charged particle. A good reason to
introduce the gauge potentials is that the coupling of the
electromagnetic field to charged particles and fields takes
a particularly simple form. The correct expression for the
interaction with a charged particle is directly obtained by
replacing, in the non-interacting particle theory, the
momentum vector p of the particle by p + qA/c, and the
energy E by E − qV, where q is the charge of the particle.
The energy function or Hamiltonian H for the charged
particle simply becomes:

\[ H - qV = \frac{1}{2m}\left(\mathbf{p} + \frac{q}{c}\mathbf{A}\right)^2 . \tag{I.1.44} \]

From this expression for the Hamiltonian, one obtains the
equation of motion for a charged particle, which yields, as
one might expect, the Newton force law featuring the Lorentz
force:

\[ \frac{d\mathbf{p}}{dt} = q\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}\right) . \tag{I.1.45} \]

What has become clear from my exposition so far is that
the electromagnetic ‘field’ as we know it in classical physics
basically corresponds to a system with an ‘infinite’ number
of degrees of freedom, namely the A, or B and E, fields
that can vary at any point in space, so that a field represents
a degree of freedom at every point of space. We have
emphasized the dynamical systems perspective because
it is significant if we consider the quantum theories of fields
and want to compare them to the quantum theory of particles.


The charge degree of freedom. If we speak of ‘a charge’,
we commonly imagine a point-like particle carrying a certain
charge, and as far as we know that charge q is quantized
in units of the fundamental electron charge −e. If the
charge q has a velocity, it corresponds to a current j = qv,
localized at the position of the particle. Often, though, we
think of a charge density which is taken to be a continuous
distribution.


The charge and current density (cρ, j) then become the charge
and current of a point charge, q and j, multiplied by a
distribution function f, which specifies how the charge and
currents are spread around the position of the particle.


A preliminary leap into quantum mechanics. At this
point it may be illuminating to jump ahead into the quantum
domain where things are so very different. For one thing,
in quantum theory a charged particle is represented by a
complex function, the so-called wavefunction Ψ(x, t),
which describes the quantum states of the particle.
‘Complex’ here means that the wavefunction has a ‘real’ and
an ‘imaginary’ part, and we may therefore write the
wavefunction as $\Psi(\mathbf{x},t) = e^{-i\alpha(\mathbf{x},t)} f(\mathbf{x},t)$, the product of a
local phase factor with a phase α(x, t) and a real function
f(x, t). Whereas the state at some time t of a classical particle
is determined by specifying its position, velocity (and
parameters like mass and charge), in quantum physics the state
is specified by the wavefunction, which is defined over all of
space. This means that the Lorentz equation of motion for
a charged particle (I.1.45) will turn into the famous
Schrödinger equation for the wavefunction Ψ. Quite a difference
indeed, and in the second Volume of the book we will fully
explore what it all implies.


In quantum theory for a single particle the momentum p is
represented by the differential operator $\mathbf{p} = -i\hbar\nabla$, which
is supposed to act on the wavefunction. And it is basically
here that the famous Planck constant ‘h-bar’, ħ = h/2π,
enters the mathematical formalism. The coupling with the
vector potential is, as mentioned before, implemented
following the minimal replacement p → p + qA/c, meaning
that in the quantum world we have to replace the ordinary
vector derivative $\nabla$ with the covariant derivative
$\mathbf{D} = \nabla + iq\mathbf{A}/\hbar c$.


The distribution $|\Psi|^2 = \Psi^*\Psi = f(\mathbf{x},t)^2$ represents the
probability density of finding the particle at the position x upon
a position measurement at time t. This distribution is
independent of the phase α. So it is not the charge which
is distributed over space, it is the probability of finding all
of that charge at a certain location in a position measurement
of the particle. That is what ‘charge density’ means
in the quantum theory of a charged particle. Similarly, the
electric current density takes the form:

\[ \mathbf{j} = -i\hbar\left(\Psi\nabla\Psi^* - \Psi^*\nabla\Psi\right) \propto f^2\,(\hbar\nabla\alpha) , \tag{I.1.46} \]

proportional to the same distribution, and in some indirect
sense ‘proportional’ to the momentum, which brings in the
factor of ħ and the phase α. We will return to this
wavefunction towards the end of this section where we discuss
the ‘quantization’ of charge, which can be linked to this
particular quantum representation of a particle.


The wave equation for the potentials. Having defined
the field strength in terms of the potentials in the equation
(II.6.8), one finds that (in a suitable gauge) the Maxwell
equations (I.1.36) reduce to the relativistic wave equation
for the potentials:

\[ \Box A_\mu = \frac{1}{c}\, j_\mu . \tag{I.1.47} \]

Also this form of the equations manifestly displays the
relativistic invariance of the system: the potentials, and the
charge density and current, are neatly organized in
four-component relativistic vectors. The wave depicted in
Figure I.1.23 corresponds to one of the solutions of the
equation (I.1.47) in empty space (without charges or currents).


The solutions of the wave equation are, not surprisingly, the
transversal electromagnetic waves. The wave solution for
the gauge potential will look like
$A_\mu \sim \varepsilon_\mu \exp[i(\mathbf{k}\cdot\mathbf{x} - \omega t)]$,
with the polarization four-vector $\varepsilon_\mu$, the so-called
wave-vector k and angular frequency ω. Substitution in the
wave equation shows that we have to impose the condition
that $|\mathbf{k}|^2 - \omega^2/c^2 = 0$. The solution corresponds with
a wave that propagates in the direction of the vector k,
where it has a wavelength equal to λ = 2π/|k|, and a
frequency ν = ω/2π. And as expected, the wave condition
ν = c/λ is satisfied.


To see the link with the wave depicted in Figure I.1.23, we
have to do some more work. First we have to realize that
the derivative vector nabla acting on the gauge potential
just brings down a factor ~ k while the time derivative
brings down a factor ~ ω. Then we can look at the
definitions (I.1.33) to conclude that B ~ k × A, while we can
choose k · A = 0, which gives E ~ ωA. With these choices
we have ascertained that the three vectors E, B and k are
mutually orthogonal and that indeed the field momentum
is in the direction of k, as S ~ (E × B) ~ k. By finally
noting that the waves for A, E and B are in phase, we have
verified all the features of the figure.
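
The geometry of the plane wave can be verified with explicit numbers. The sketch below works with amplitudes only and drops the common phase factor and overall factors of i; the particular wave vector, the starting amplitude and the unit choice c = 1 are illustrative assumptions.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

C = 1.0                                  # speed of light in illustrative units
k = (1.0, 2.0, 2.0)                      # wave vector, |k| = 3
omega = C * norm(k)                      # from |k|^2 - omega^2/c^2 = 0

# Take an arbitrary amplitude and remove its part along k, so that k . A = 0.
a_raw = (1.0, 0.0, 0.0)
k_hat = tuple(ki / norm(k) for ki in k)
A_amp = tuple(a_raw[i] - dot(a_raw, k_hat) * k_hat[i] for i in range(3))

E_amp = tuple(omega / C * ai for ai in A_amp)    # E ~ omega A (time derivative)
B_amp = cross(k, A_amp)                          # B ~ k x A   (curl)
S = tuple(C * s for s in cross(E_amp, B_amp))    # Poynting vector c (E x B)

print(dot(E_amp, B_amp), dot(E_amp, k), dot(B_amp, k))   # all (close to) zero
print(norm(cross(S, k)), dot(S, k))                      # S parallel to +k
```

All three scalar products vanish up to rounding, and S has no component perpendicular to k while its projection on k is positive, so the momentum flows along the propagation direction.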


This wave equation for the potentials creates the best
starting point for the ‘quantization’ of the electromagnetic field.
As we will see later, the A fields are preferred for two
reasons. Firstly, if one wants to quantize the electromagnetic
field, it is convenient to think of the $A_i$ fields as generalized
‘coordinates’, while the electric fields $E_i \sim \partial A_i/\partial t$ are like
the ‘momenta’ of the field. Secondly, as we saw above, the
coupling to charged particles takes a particularly simple form
in terms of the potentials.


It is actually quite a remarkable fact about the Maxwell
equations that as equations they survived both the relativity
and the quantum revolution. As we will see, it is in
the interpretation, in going from classical fields to quantum
fields, that the great revolution took place.


Gauge invariance: beauty and redundance


The introduction of gauge potentials naturally leads to the 
notion of gauge invariance. In one sense it signals a resid- 
ual redundancy in the formulation of the theory. This prin- 
ciple is worth exploring as it plays a crucial role in the for- 
mulations of all theories that describe fundamental inter- 
actions. 


Once you write the equations in terms of the gauge
potentials, another fundamental but somewhat elusive
property becomes apparent. We have successfully reduced
the electromagnetic field from six to four components, by
introducing the potentials $A_\mu$, but what we will argue next
is that there is still a redundancy in the definition of the
system. Whereas giving the gauge potentials yields a unique
answer for the physical E and B fields, the converse is not
true: a given set of E and B fields does not uniquely fix
the gauge potentials, and this redundancy is called gauge
invariance.


Figure I.1.25: Gauge transformations of the author as Mr Vector
Potential. The pictures illustrate the idea of smooth local
transformations. The information content (the person) is the
same but the representations or copies are different.


Let us change the gauge potential by (yes indeed) a
gauge transformation involving an arbitrary function
Λ = Λ(x, t) as follows:

\[ A_\mu \rightarrow A'_\mu = A_\mu - \partial_\mu \Lambda . \tag{I.1.48} \]

If we calculate the transformed fields E′ and B′, we learn that
the field components are invariant: E′ = E and B′ = B, because
for any pair of indices μ and ν we have that
$\partial_\mu\partial_\nu\Lambda - \partial_\nu\partial_\mu\Lambda = 0$.
In other words, the contributions of the gauge function cancel.
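
This cancellation can be checked numerically. The following is a minimal sketch, assuming c = 1 and arbitrarily chosen smooth functions for V, A and the gauge function Λ = t sin x cos y + z² (none of these come from the book). With $A_\mu = (V, -\mathbf{A})$ the transformation reads V → V − (1/c) ∂Λ/∂t and A → A + ∇Λ; the fields are obtained from the potentials with central finite differences.

```python
import math

C = 1.0  # speed of light in illustrative units

def V(t, x, y, z):
    return x * y + math.sin(t) * z

def A(t, x, y, z):
    return (y * z + t, t * math.cos(x), x * x * y)

# Derivatives of the gauge function Lambda = t sin(x) cos(y) + z^2, by hand.
def dLam_dt(t, x, y, z):
    return math.sin(x) * math.cos(y)

def grad_Lam(t, x, y, z):
    return (t * math.cos(x) * math.cos(y),
            -t * math.sin(x) * math.sin(y),
            2.0 * z)

def V_prime(t, x, y, z):
    return V(t, x, y, z) - dLam_dt(t, x, y, z) / C

def A_prime(t, x, y, z):
    a, g = A(t, x, y, z), grad_Lam(t, x, y, z)
    return tuple(ai + gi for ai, gi in zip(a, g))

def partial(f, point, i, h=1e-5):
    """Central difference of f with respect to coordinate i (0 = t, 1..3 = x, y, z)."""
    up, dn = list(point), list(point)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def fields(Vf, Af, point):
    """E = -grad V - (1/c) dA/dt and B = curl A at the given (t, x, y, z)."""
    # dA[i][k] is the derivative of component A_k with respect to coordinate i
    dA = [[partial(lambda *p, k=k: Af(*p)[k], point, i) for k in range(3)]
          for i in range(4)]
    grad_V = [partial(Vf, point, i) for i in (1, 2, 3)]
    E = tuple(-grad_V[k] - dA[0][k] / C for k in range(3))
    B = (dA[2][2] - dA[3][1], dA[3][0] - dA[1][2], dA[1][1] - dA[2][0])
    return E, B

point = (0.3, 0.5, -0.7, 1.1)
E1, B1 = fields(V, A, point)
E2, B2 = fields(V_prime, A_prime, point)
print(E1, E2)
print(B1, B2)
```

The potentials differ at every point, yet the two sets of fields agree to within the accuracy of the finite differences.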


Let me note that the gauge transformations form a group:
they satisfy the group property that two successive
transformations form again a gauge transformation (with
Λ = Λ₁ + Λ₂).⁷ The observable physics, which resides in the
E and B fields, is independent of Λ, and therefore the theory
is said to be gauge invariant. In other words, we have
the freedom to choose any convenient function Λ to describe
the physics, which is referred to as the freedom to
choose a ‘suitable gauge’. This choice is useful for example
if one needs to construct explicit solutions, but if one
has to quantize the electromagnetic field, then this blessing
becomes a burden. You could say that the description
of the physics in terms of the gauge potentials is elegant
but at the same time redundant. It obscures to a certain
extent what exactly the real physical degrees of freedom of
the (quantized) electromagnetic field are. The wave equation
for each of the four components of the vector potential
suggests that there are four independent components to
the field, yet looking at the electromagnetic wave of
Figure I.1.23 we see that in fact it has only two physical
components. This further reduction of degrees of freedom from
four to two is due to the gauge invariance of the equations.


⁷ The curious reader may like to jump ahead and look at the Math
Excursion on groups on page 635 of Part III.


Gauge symmetry and charge conservation. The Maxwell
equation (I.1.36) and the fact just mentioned that the field
strength is gauge invariant mean that this system is only
consistent if the current itself is also gauge invariant. This
property can be used to show that the continuity equation
$\partial^\mu j_\mu = 0$ follows from gauge invariance. In other words,
the conservation of charge is a consequence of the gauge
symmetry.


A nice way to show this more directly is by noting that the
interaction term between the current and the gauge
potentials has to be (i) local, (ii) Lorentz-invariant, and (iii)
has to give rise to the correct Maxwell equations, which
means that it has to be of the form $\int A_\mu j^\mu\, d^4x$. If we now
make the gauge transformation (I.1.48), the coupling term
acquires an extra term $\int (\partial_\mu\Lambda)\, j^\mu\, d^4x$, which has to vanish
if the theory is gauge invariant. This term can be recast in
a convenient form using the following mathematical identity:

\[ \int \partial_\mu(\Lambda j^\mu)\, d^4x = \int (\partial_\mu\Lambda)\, j^\mu\, d^4x + \int \Lambda\,(\partial_\mu j^\mu)\, d^4x , \]

which is just writing the derivative of a product of two
functions as a sum of derivatives on the individual factors and
then integrating over space-time. The left-hand side can be
integrated to yield the integrand $\Lambda j^\mu$ integrated over the
three-dimensional boundary of the space-time volume, but on
the boundary of space-time we assume the current $j^\mu$ will
vanish and therefore so does the integrand. And as the
integral of zero is zero, the left-hand side of the equation
above is zero. This in turn means that the effect of the
gauge transformation on the interaction term equals:

\[ \int (\partial_\mu\Lambda)\, j^\mu\, d^4x = -\int \Lambda\,(\partial_\mu j^\mu)\, d^4x . \tag{I.1.49} \]

Now the elegant argument continues by saying that because
the gauge function Λ(x, t) can be chosen arbitrarily,
the integral condition has to be satisfied locally, and thus
we have to require $\partial_\mu j^\mu = 0$ everywhere.


Stated in words, what we have shown is that imposing 
local gauge invariance requires the current to which the 
electromagnetic field couples to be conserved locally. This 
means that net charge can move around obeying the con- 
tinuity equation, but it cannot just disappear into nothing. 
This is a not so surprising but vital result that resonates 
with our earlier observations that the conservation laws of 
momentum and angular momentum were a consequence 
of the space-time symmetries being translational and rota- 
tional invariance. In that sense one can say that the gauge 
transformation is like a rotation in a kind of ‘internal space’ 
of allowed gauge transformations. This discussion will be
taken up in more detail and generality in Chapters I.2 and
II.6 where we will have more to say about the geometry of
gauge invariance.


A non-local observable: the loop integral of A . Clearly 
the gauge potentials, as they are gauge-choice depen- 
dent, cannot be real observables, the physics resides in 
the gauge invariant observables being the electric and mag- 
netic fields. These quantities are /ocal in that they can be 
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Xo 


Figure 1.1.26: The line integral of the vector potential A_μ. The
line integral of the four-vector potential A_μ from point x_0 to x_1
along a curve γ. It ‘adds’ the projections of A_μ along the tangent
direction dx^μ(γ) of the curve for all points along γ.


measured locally at a given point x^μ. We may, however,
also consider other fundamental gauge invariant quantities,
which are intrinsically non-local and involve the line
integral of the gauge potential A_μ along a closed curve in
space-time.


Let us just start by considering a line integral of the vector
potential along some curve γ starting at a space-time point
x_0 and terminating at point x_1 as depicted in Figure 1.1.26.
We write this as follows:

$$ I(\gamma; x_0, x_1) = \int_{x_0}^{x_1} A_\mu \, dx^\mu(\gamma). \tag{1.1.50} $$


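Now let us look at what a gauge transformation does to this
line integral:

$$ I(\gamma; x_0, x_1) \;\to\; \int_{x_0}^{x_1} (A_\mu - \partial_\mu \Lambda)\, dx^\mu(\gamma) = I(\gamma; x_0, x_1) - \Lambda(x_1) + \Lambda(x_0). \tag{1.1.51} $$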


Clearly the path dependent expression is only affected by
the transformation at the start and end point. This implies
that if we choose the start and end point to be the same,
the resulting ‘loop integral’ will be gauge invariant as the
gauge function Λ drops out. Let us take the example of a
closed curve for a fixed time:

$$ \oint_{\partial D} \mathbf{A}\cdot d\mathbf{x} = \int_D (\nabla\times\mathbf{A})\cdot\hat{\mathbf{n}}\, d^2S = \int_D \mathbf{B}\cdot\hat{\mathbf{n}}\, d^2S = \Phi, \tag{1.1.52} $$

where n̂ is the unit vector perpendicular to the surface element
d²S = dx dy of the surface D bounded by the curve
∂D. The first equality is an application of Stokes’ theorem,
which is a mathematical identity explained in the
Math Excursion on vector calculus at the end of Part III.
The second equal sign follows from using the defining relation
(1.1.33b) between the vector potential A and the magnetic
field B. Because the contribution of the gauge transformation
drops out, this loop integral is gauge invariant
and corresponds therefore to a physical and observable
quantity, which is not so surprising once you realize that it
‘measures’ the total magnetic flux Φ through any surface
D bounded by the curve, which is a gauge invariant quantity.
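The two statements above, that an open line integral changes only through the end-point values of the gauge function and that a closed loop integral is gauge invariant and equals the enclosed flux, can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes a uniform field B0 confined to a disc of radius a, and an arbitrarily chosen smooth gauge function; the names `A`, `Lam`, `A_gauged` and `line_integral` are illustrative only.

```python
import math

B0, a = 2.0, 1.0   # illustrative field strength and radius of the flux region

def A(x, y):
    """Vector potential of a uniform field B0 confined to the disc r <= a."""
    r2 = x * x + y * y
    scale = 0.5 * B0 if r2 <= a * a else 0.5 * B0 * a * a / r2
    return (-scale * y, scale * x)

def Lam(x, y):
    """An arbitrary smooth, single-valued gauge function (assumption)."""
    return math.sin(x) * y + x

def A_gauged(x, y):
    """Gauge-transformed potential A - grad(Lam)."""
    ax, ay = A(x, y)
    return (ax - (math.cos(x) * y + 1.0), ay - math.sin(x))

def line_integral(field, R, t0, t1, steps=20000):
    """Midpoint-rule integral of field . dx along the circle of radius R,
    from polar angle t0 to polar angle t1."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * dt
        x, y = R * math.cos(t), R * math.sin(t)
        fx, fy = field(x, y)
        total += (-fx * math.sin(t) + fy * math.cos(t)) * R * dt
    return total

R = 3.0                                   # the loop lies entirely where B = 0
flux = math.pi * a * a * B0
loop = line_integral(A, R, 0.0, 2 * math.pi)
loop_gauged = line_integral(A_gauged, R, 0.0, 2 * math.pi)

# Open path from x0 = (R, 0) to x1 = (0, R): only the end points matter.
shift = (line_integral(A_gauged, R, 0.0, math.pi / 2)
         - line_integral(A, R, 0.0, math.pi / 2))
expected_shift = -Lam(0.0, R) + Lam(R, 0.0)

print("flux, loop, gauged loop:", flux, loop, loop_gauged)
print("open-path shift, end-point prediction:", shift, expected_shift)
```

The loop of radius 3 never enters the region where B is non-zero, yet both loop integrals reproduce the flux through the disc.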


Gauge versus topological invariance. Yet, there is something
quite remarkable about this result. Let us for simplicity
consider a two-dimensional plane and have some non-vanishing
magnetic flux piercing through the surface area
bounded by, say, the unit circle, depicted as the dark region
in Figure 1.1.27. Outside the unit circle the physical E
and B fields are zero but that does not imply that the gauge
potentials have to be zero there as well. It only requires
that the gauge potentials are pure gauge: A_μ = ∂_μ Λ, in
other words, that they are a gauge transformation of the field
A_μ = 0.


Then the result above tells us that you can measure the
total magnetic flux Φ through any finite domain, by taking
the line integral of the gauge potential around a closed loop
which is arbitrarily far removed from that domain. You can
Figure 1.1.27: The loop integral of the vector potential. A line
integral of the vector potential A along a closed spatial loop is
a gauge invariant but non-local quantity. The dark region inside
the loop is the region where the magnetic field is non-zero, so
everywhere along the loop there is zero magnetic field, yet the
line integral will yield a non-zero magnetic flux Φ.


measure the total flux without ever entering a region where 
the magnetic field B is non-zero. Indeed there is a non- 
local, gauge invariant quantity, corresponding to a mea- 
surement outcome that may assume any non-zero value, 
and that involves probing only a region of space where all 
physical fields are zero! Quite remarkable indeed! 


Imagine we choose the closed loop around a big circle
at infinity (the boundary of space) parametrized by (r =
∞, φ), then we find for the loop integral simply:

$$ I(\gamma = S^1_\infty) = \Lambda(\varphi = 2\pi) - \Lambda(\varphi = 0). \tag{1.1.53} $$

Here we run into an apparent contradiction, because on
the one hand we argue that the gauge function has to
be single-valued, meaning that the right-hand side of the
above equation should vanish, but on the other hand the
left-hand side of the equation is nothing but the loop integral
(1.1.52), which equals the total flux Φ!


The resolution of this paradox lies in the appreciation of
what we precisely mean by a gauge transformation. We
keep the definition simple: a gauge transformation Λ(x, t)
is a smooth, single-valued function. Indeed, under such
a transformation the value of the loop integral (1.1.52) cannot
change. The converse also holds true: if we make a
transformation that is not single valued, we by definition
do change the outcome of the loop integral and thereby
somehow have changed the magnetic field through the
loop.


Let us illustrate this by a simple example: imagine somebody
tells me that they have chosen Λ(x, t) = bφ, a constant
times the polar angle φ. Then the loop integral would
give a flux Φ = Λ(2π) − Λ(0) = 2πb. This does not correspond
to a proper gauge transformation because it is not
single valued. Now it is a matter of semantics what you
want to call this transformation; some physicists call it a
‘singular’ gauge transformation, and others call it a ‘topologically
non-trivial’ gauge transformation. Presumably this
is intended to emphasize that it looks like a gauge transformation
while strictly speaking it is not, since it is singular
at the origin of the plane (r = 0) where φ is not well defined.
And indeed such a ‘transformation’ would ‘create’
a magnetic flux-line through the point (or the line) where
r = 0.
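To see the ‘created’ flux-line explicitly, one can integrate the gradient of Λ = bφ around different loops. The sketch below is illustrative only (the function names and the value of b are assumptions). It uses the Cartesian form of the gradient, b(−y, x)/r², and compares a loop that winds around r = 0 with one that does not.

```python
import math

def pure_gauge_potential(x, y, b):
    """grad(b * phi) in Cartesian components; singular at the origin."""
    r2 = x * x + y * y
    return (-b * y / r2, b * x / r2)

def loop_integral(b, cx, cy, R, steps=20000):
    """Integral of grad(b * phi) . dx around the circle of radius R
    centred at (cx, cy), by the midpoint rule."""
    dt = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = cx + R * math.cos(t), cy + R * math.sin(t)
        ax, ay = pure_gauge_potential(x, y, b)
        total += (-ax * math.sin(t) + ay * math.cos(t)) * R * dt
    return total

b = 0.7
enclosing = loop_integral(b, 0.3, -0.2, 1.0)      # loop winds around r = 0
not_enclosing = loop_integral(b, 5.0, 0.0, 1.0)   # loop does not

print("loop around the origin:", enclosing, "expected 2*pi*b =", 2 * math.pi * b)
print("loop away from the origin:", not_enclosing)
```

Any loop around the origin returns 2πb whatever its shape or size, while a loop that avoids the origin returns zero, as expected for a flux concentrated on the line r = 0.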


In Chapter 1.2, in the section on the geometry of gauge
invariance on page 96, we will see that there is a rigorous
topological characterization of the values that the
loop integral traversing a vacuum region (or ground state
region of some medium) can acquire. The physical situation
is determined by a mapping of the closed loop (which
is topologically equivalent to a circle S¹) in space into the
gauge group G. The outcomes are now determined by the
number of topologically distinct ways we can do this and
that depends on the global structure of the group-space of
G.


For the case of electrodynamics where we have quantized
charges the gauge group is the phase group, which
is also topologically a circle S¹. The elements can be represented
as g(χ) = e^{iχ}. The constraint that follows is that
χ(φ) = nφ, meaning that if we go around once in real
space then we have to go around an integer n times in
the gauge group (so that g(2π) = g(0)). So the distinct
classes are labeled by this integer n with −∞ < n < +∞.
So in this theory both the electric charges and the magnetic
fluxes would be quantized in suitable units. And because
this number is fixed topologically, it is extremely robust.
It will not change under any smooth deformation of
the gauge potentials (not just gauge transformations). The
winding number n is therefore a conserved quantity under
any smooth deformation, and because it is conserved and
quantized for a topological reason, it is called a topological
quantum number.
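The robustness of the winding number can be made concrete with a small numerical experiment. The sketch below is an illustration under simple assumptions (the helper names and the particular deformation are hypothetical). It counts how often a map from the circle into the phase group U(1) goes around by accumulating small phase increments, first for the pure map e^{inφ} and then for a smoothly deformed version.

```python
import cmath
import math

def winding_number(g, samples=2000):
    """Winding number of a map phi -> g(phi) on the unit circle in the
    complex plane, obtained by summing small phase increments."""
    total = 0.0
    prev = g(0.0)
    for k in range(1, samples + 1):
        cur = g(2 * math.pi * k / samples)
        total += cmath.phase(cur / prev)   # increment, assumed well below pi
        prev = cur
    return round(total / (2 * math.pi))

def pure(n):
    """The map g(phi) = exp(i n phi)."""
    return lambda phi: cmath.exp(1j * n * phi)

def deformed(n, amplitude=0.8):
    """A smooth, single-valued deformation of the same map (assumption)."""
    return lambda phi: cmath.exp(1j * (n * phi + amplitude * math.sin(3 * phi)))

for n in (-2, 0, 1, 3):
    print(n, winding_number(pure(n)), winding_number(deformed(n)))
```

The deformation changes the phase at almost every point of the loop, but the integer that counts the total number of turns is unchanged.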


We will see later that both gauge invariance and topological
invariance play a fundamental role in quantum theory.
The loop integral we just discussed is an observable quantity
that can be measured as a shift in the interference
pattern in a double-slit experiment with electrons. This is
known as the Aharonov-Bohm effect, after the theorists who
proposed this experiment, and it is examined in Chapter II.3.
This effect is a special case of a generalization
known as the Berry phase, which we cover in the
same chapter. In an entirely different context the topological
invariance of the loop integral can also be linked to the
all-important feature of the quantum statistics properties of
different particle types, like bosons or fermions, as we will
discuss in Chapter II.5.


Monopoles: Nature’s missed opportunity?


Charge quantization and magnetic monopoles. There 
is a brilliant, rather early use of the gauge invariance and 
parallel transport arguments we just presented, by Paul 
Dirac. In a famous 1931 article he boldly proposed the ex- 
istence of magnetic monopoles, and proved that the mere 
Figure 1.1.28: Dirac in doubt. This is a fragment from a letter
of Dirac to Abdus Salam from 1981, declining an invitation
to attend a monopole meeting at the ICTP in Trieste. (Source: 
Proceedings of Monopoles in QFT, ICTP, Trieste, 1981.) 


existence of just a single magnetic monopole in the whole
universe would suffice to explain the observed quantization
of electric charge! We have already mentioned that
a magnetic monopole has never been observed, but that
fact by itself does not really exclude the possibility that they
somehow exist. Maybe they once existed and subsequently
disappeared through some annihilation process,
given that to our knowledge that was what happened to
anti-matter, for example. ‘To be or not to be, that is the
question’, because just being there would suffice!


Dirac’s proof goes in fact one way: he shows that if a
monopole existed, then electric charge would have to
be quantized in integer multiples of some minimal charge 
e. In the concluding section of his 1931 article, after not- 
ing that the charges we have observed in nature are quan- 
tized, he modestly states: ‘One would be surprised if na- 
ture wouldn’t have made use of it.’ 


In practice we can of course do without monopoles because
all magnetic phenomena that have been observed
can be explained as being caused by electric currents,
moving charges in other words. In all observed magnetic
phenomena, there is always a combination of north and
south poles involved. If you break a bar magnet into two
pieces, you get two bar magnets, not a separated north
and south pole. And that rule so far holds on all scales,
even the smallest accessible. As we mentioned before,
this is also the reason that the sourceless Maxwell
equation for the magnetic field reads ∇ · B = 0, where
the zero on the right-hand side expresses the merciless
verdict: ‘No monopoles!’ In theory there could have been
a ‘magnetic’ source term there, but there is none. However,
the price for it not being there is that the observed
quantization of electric charge for the moment remains a
mystery. A mystery that has not even been resolved by
today’s Standard Model of elementary particles and fundamental
forces.


The charge quantization puzzle would actually be resolved
if the so-called Grand Unified Theories or GUTs turned out
to be correct. These theories unify all non-gravitational interactions
in one overarching model, as we will discuss
in Chapter 1.4. This means that different particle types
like quarks and electrons belong to a single representation
which links their relative charges. Believe it or not, this
is precisely the case because those models necessarily
contain magnetic monopoles in their spectrum, as was brilliantly
shown by Gerard ’t Hooft and Alexander Polyakov
independently in 1974. And indeed in these models electric
charge is quantized. However, these Grand Unified
monopoles would be so heavy, of the order of 10^16 proton
masses, that there is no hope of making them, even in a
fancy lab like CERN. Yet, never say never, maybe Dirac
will turn out to be right after all. This in spite of the doubt
that Dirac himself cast over his prediction towards the end
of his life, as expressed in the short note to Abdus Salam
depicted in Figure 1.1.28.


Dirac’s argument. Dirac’s argument for charge quantization
is sketched in Figure 1.1.29. Imagine we put a
monopole with magnetic charge g at the origin; then the
magnetic field would point radially outward. The total flux
going out through any surface enclosing the monopole is
then equal to g. Now imagine the situation sketched in
Figure 1.1.29, where I draw a sphere around the monopole
and I take a charge q and make a closed loop on the surface.
Clearly the product of the charge and the gauge invariant
loop integral equals the charge times the flux going
through the loop. Let me first look at the flux going through
the ‘northern’ surface segment, giving me a flux going upward,
say Φ_N = α. However, I could also have taken the
flux through the ‘southern sector’ going down; then that
flux would be Φ_S = −(g − α). The phase factors computed
from the two fluxes have to be the same (because both
surfaces are bounded by the same loop), so the phases
themselves can differ only by an integer multiple of 2π:

$$ \frac{q}{\hbar c}\,\alpha = -\frac{q}{\hbar c}\,(g - \alpha) + 2\pi n \quad\Rightarrow\quad q g = 2\pi n \hbar c. \tag{1.1.54} $$


Indeed the flux α drops out as it should, because the argument
holds for any arbitrary closed loop on the surface.
Dirac used the argument exactly the other way around:
if there somewhere exists a minimal magnetic charge g,
then qg = 2πnħc. This in turn implies q = ne, so that
eg = 2πħc, where e is the minimal electric charge. Therefore
he showed that the existence of a magnetic monopole
implies the charge quantization that we observe in nature.
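A short numerical sketch may help to see what the condition says. It assumes the conventions used above (total flux equal to g, and phases of the form qΦ/ħc), and in addition assumes that the fine-structure constant is defined as α_fs = e²/(4πħc) in these units. That definition is not stated in the text, so the resulting number should be read with that caveat. The function names are illustrative.

```python
import cmath
import math

ALPHA_FS = 1 / 137.035999   # fine-structure constant, approximate measured value

def phases_agree(qg_over_hbar_c, north_fraction, tol=1e-9):
    """Compare the phase factors obtained from the northern and southern caps.
    north_fraction is the share of the total flux g through the northern cap."""
    north = cmath.exp(1j * qg_over_hbar_c * north_fraction)
    south = cmath.exp(-1j * qg_over_hbar_c * (1.0 - north_fraction))
    return abs(north - south) < tol

def minimal_pole_strength_over_e(alpha_fs=ALPHA_FS):
    """g/e for n = 1, from e g = 2 pi hbar c with alpha_fs = e^2 / (4 pi hbar c)."""
    return 1.0 / (2.0 * alpha_fs)

# The two phase factors agree for every loop only if q g / (hbar c) = 2 pi n.
print(phases_agree(2 * math.pi * 3, 0.37))     # integer n = 3
print(phases_agree(2 * math.pi * 2.5, 0.37))   # non-integer value
print("g/e for the minimal monopole:", minimal_pole_strength_over_e())
```

Under these assumptions the minimal magnetic charge is about 68.5 times the elementary electric charge, so a Dirac monopole would couple to the electromagnetic field far more strongly than an electron does.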


Conversely, it is also true that if there existed two particles
with incommensurate charges, meaning to say that their
ratio would be some irrational number like π or
√2, then that fact by itself would exclude the existence of
magnetic monopoles. So we are left with a stunningly simple
and profound explanation of the observed quantization
of electric charge, except for the slightly inconvenient fact
that we haven’t seen any monopole (yet)!


The monopole or Hopf bundle. We have been somewhat
cavalier about the precise argument. You could even


Figure 1.1.29: Electric charge quantization. This figure il- 
lustrates Dirac’s 1931 argument for the quantization of elec- 
tric charge based on the hypothetical existence of a magnetic 
monopole. To describe the monopole field with potentials requires
at least two overlapping patches with potentials A_±.


claim that I arrived at the correct answer by incorrect reasoning.
You see, the moment I put a magnetic source on
the right-hand side of the magnetic Maxwell equation, then
it is no longer sourceless. In that case the mathematical
identity ∇ · (∇ × A) = 0 of equation (A.11a) is no
longer compatible with ∇ · B ≠ 0, which appears to imply
that you cannot write the magnetic field in terms of potentials
if monopoles are present. Fortunately the situation is not
that bad, because the proper use of the potentials turns out
to be more subtle. In fact you can still use them, but only
locally, as there is a topological obstruction to writing a
single potential that gives the magnetic field everywhere on
a surface fully enclosing the monopole. Somewhere on that
surface that potential would become singular and the
description in terms of a gauge potential would break down.
There is a mathematical resolution however, but it is somewhat
complicated, and it reveals a fundamental aspect of gauge
theories in general. And that is the reason to explore this.


What I am going to describe to you is the mathematical
concept of a fibre bundle, and we will describe these in
more general terms in the section on the ‘Physics of ge- 
ometry’ in the next chapter. You could say that we have to 
enlarge the mathematical framework to that of fibre bun- 
dles to allow for situations we couldn’t properly cope with 
before, like having magnetic monopoles. 


We start by introducing two coordinate patches S_+ and S_−
that cover the sphere, each having the topology of a disc,
that have an overlap region with the topology of a cylinder.
This is depicted in Figure 1.2.30 on page 86 for the
sphere S², with the blue and green patches S_+ and S_−,
and their overlap region containing the equator. Then we
define two gauge potentials, say A_+ and A_−, on these two
patches that exactly give the magnetic fields present on
the patches. So we don’t care what A_± do outside their
patch; they may well develop a singularity there but as we
don’t use them there it doesn’t matter. In the overlap region
these potentials define strictly identical magnetic fields and
therefore have to be related by a gauge transformation. This
is shown in Figure 1.1.29, where in the overlap region the
two gauge potentials have to be gauge transformations of
each other. In terms of equations the statement just made
reads:

$$ \nabla \times \mathbf{A}_\pm = \mathbf{B}_\pm \quad \text{for } x \in S_\pm, \tag{1.1.55a} $$
$$ \mathbf{B}_+ = \mathbf{B}_- = \mathbf{B} \quad \text{for } x \in (S_+ \cap S_-), \tag{1.1.55b} $$
$$ \mathbf{A}_- = \mathbf{A}_+ - \nabla\Lambda \quad \text{for } x \in (S_+ \cap S_-). \tag{1.1.55c} $$


Note that although locally the potentials produce the same
magnetic field, what is also clear from the figure is that
when we take the loop integral in the overlap region,
around the equator for example, then for (e/ħc)A_+ we get the
monopole flux through the northern hemisphere, eg/2ħc,
but for A_− we get the flux through the southern hemisphere,
which has to yield the opposite, −eg/2ħc. This
means that the loop integral over the gauge transformation
has to be equal to their difference:

$$ \frac{e}{\hbar c}\oint \nabla\Lambda \cdot d\mathbf{l} = \frac{e}{\hbar c}\oint (\mathbf{A}_+ - \mathbf{A}_-)\cdot d\mathbf{l} = \frac{eg}{2\hbar c} - \left(-\frac{eg}{2\hbar c}\right) = \frac{eg}{\hbar c}. \tag{1.1.56} $$
Figure 1.1.30: Parallel transport of charge vector or phase
around the equator in the monopole field. The phase shifts
β_± calculated from A_± in the northern/southern hemisphere respectively
have opposite signs. The requirement for a U(1) bundle
is that the transition function f(φ) = e^{iΛ(φ)} has to be single
valued and has in this minimal case a winding number n = 1
because Λ(2π) = 2π.


If we have a field carrying a charge e it will have an electromagnetic
phase factor e^{iα}, with a locally defined phase
α = α(x). This phase will change under a gauge transformation
Λ according to α → α′ = α − eΛ/ħc. We may
now impose that this charged field is single valued, in which
case it follows from equation (1.1.56) that we have to impose
eg/ħc = 2πn, the Dirac quantization condition.


In Figure 1.1.30 we show an explicit configuration of the
phase factors for a charged field. For the elementary n = 1
monopole the angles β_±(φ) are defined as

$$ \beta_\pm(\varphi) = \int_0^{\varphi} \mathbf{A}_\pm \cdot d\mathbf{l} = \pm\frac{\varphi}{2}. \tag{1.1.57} $$

We learn that the difference between the two line integrals
is given by Λ = β_+ − β_− = φ. The loop integrals are
gauge invariant and Λ(2π) = 2π, which means that the
transition function f(φ) = e^{iΛ} is single valued.
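The two-patch construction can be made explicit with the standard form of the patch potentials. This form is not given in the text and is assumed here, normalized so that the total flux equals g. In spherical coordinates the only non-zero component is A_φ = g(±1 − cos θ)/(4πr sin θ), regular on the northern (+) or southern (−) patch. The sketch below, with illustrative function names, compares the loop integrals around a circle of latitude with the flux through the corresponding cap.

```python
import math

def loop_integral(theta, g, patch):
    """Line integral of A_+ (patch=+1) or A_- (patch=-1) around the circle of
    latitude at polar angle theta: A_phi times the circumference 2 pi r sin(theta)."""
    return 0.5 * g * (patch - math.cos(theta))

def cap_flux(theta, g, steps=20000):
    """Flux of the radial field B = g r_hat / (4 pi r^2) through the northern
    cap 0 <= theta' <= theta, by midpoint-rule integration over theta'."""
    dth = theta / steps
    total = 0.0
    for k in range(steps):
        th = (k + 0.5) * dth
        total += 0.5 * g * math.sin(th) * dth
    return total

g = 1.0
equator = math.pi / 2
north = loop_integral(equator, g, +1)    # flux through the northern hemisphere
south = loop_integral(equator, g, -1)    # minus the flux through the southern one
print("A_+ and A_- around the equator:", north, south)

# The difference, the loop integral of grad(Lambda), is the full flux g
# for every circle of latitude in the overlap region.
for theta in (1.0, equator, 2.0):
    print(theta, loop_integral(theta, g, +1) - loop_integral(theta, g, -1))
```

Multiplying the difference g by e/ħc gives the total change of the transition phase around the loop, which is single valued only when eg/ħc is an integer multiple of 2π.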


Topological sectors. The existence of this non-trivial U(1)
fibre bundle corresponding to the fundamental monopole
with f(φ) = e^{iΛ(φ)} = e^{iφ} was discovered independently
by the German mathematician Heinz Hopf, amusingly in
1931, the same year that Dirac wrote his monopole paper.
It took about forty years before the Chinese American
physicists Tai Tsun Wu and Chen Ning Yang discovered
the mathematical equivalence of these remarkable works
of the mind. The bundle space describing the fundamental
monopole is basically the three-sphere S³, and Hopf
showed that you can consider S³ as an S¹ (which equals
the group U(1)) bundle over a base manifold S². We will
return to this topological classification of bundles in the
next chapter.


So the fibre bundle perspective adds an essential insight
into our understanding of electromagnetism as a gauge
theory. It is the discovery and classification of topologically
non-trivial sectors in the theory. These sectors are
defined by mappings of boundaries (or overlap regions) of
real space (which themselves are always spaces without
boundary) into the gauge group or more generally some
‘internal space’. These maps can be non-trivial, and if
they are, they label certain topological sectors which define
some discrete ‘topological charge’. These charges
are therefore quantized and conserved for a topological
reason which is not directly related to the standard symmetry
type argument. Indeed in electrodynamics with monopoles
the conservation of electric charge is a consequence
of gauge invariance, and the conservation of magnetic
charge is topological in nature.


If you look at the monopole as a two-dimensional version
of electrodynamics on a closed surface, then the total integral
of the magnetic field strength over that closed surface
would always have to be an integer in the appropriate
units. The total flux is a topological invariant of the gauge
field A, because you can make any smooth deformation
of the gauge field over the surface (not just gauge transformations)
and that integer would stay the same. This
total flux, which equals the magnetic charge, is a topological
invariant characterising the gauge field on the surface
and is called the Chern number. So, indeed, on the two-sphere
the discrete values of the total magnetic
flux label different topological sectors of allowed electromagnetic
fields. These topological features of gauge theories
play an important role in many subfields of physics,
for example in understanding the (integer) Quantum Hall
effect.


To appreciate the subtlety of the argument let us once
more step back and see how it is (quantum) physics that
dictates the result. This has to be the case because how
else could Planck’s constant show up in the charge quantization
formula? That can’t be accidental! We see that we
map the circle in real space S¹ into the gauge group, which
was also a circle S¹. The topological sectors are labeled
by the winding number of this map, telling you that
Λ(φ = 2π) = 2πn. The compactness of this group tells
you therefore two things: (i) that the permitted charges are
labeled by integers corresponding to the unitary representations
of the group, and (ii) that there are topological sectors
corresponding to quantized magnetic charges. If nature
had given us particles with arbitrary electric charges
like πe or e√2 besides e itself, then that would have implied
that the gauge group could not have been the compact
U(1) but would have been the non-compact group ℝ.
Its unitary representations are not labeled by integers, so
there would be no charge quantization. But at the same
time the argument for the existence of non-trivial topological
sectors would also collapse. All mappings of the circle
S¹ into a line are contractible to a point, meaning that
they are all topologically equivalent, and consequently that
there is only one sector in the theory. The world would be
without a discrete conserved magnetic charge: no monopoles!


As we will see later the state space of a qubit is also a 
three-sphere and we will also use the representation of 
the three-sphere as a bundle space in that context. We will 
Figure 1.1.31: The charge-pole system. The charge-pole
system is static but has an angular momentum nevertheless.
The total angular momentum can be calculated to be equal to
J = eg/4πc, which with Dirac’s quantization condition yields
values J = (1/2, 1, 3/2, ...)ħ.


return to and expand on these more extended geometrical 
and topological concepts in Chapter 1.2 in the section on 
The physics of geometry. We emphasize these topological 
features of our mathematical representations of physical 
systems, because after all topological data refer typically to 
class labels which are in many cases discrete. In a sense 
this is a form of quantization that is maybe less familiar
but certainly no less quantessential.


A remarkable case of ‘static’ angular momentum.


The system of a spatially separated electric charge and
magnetic monopole has a curious property, first pointed out
by J.J. Thomson, who presented it as a problem in the Cambridge
University Tripos exam in the late 1890s. In Figure
1.1.31 we have depicted the situation with the charge and
pole located on the z-axis. At two points symmetric with respect
to the z-axis we have the electric and magnetic fields
E and B, and the resulting Poynting or field-momentum vector
S. The contribution to the angular momentum around
the z-axis is clearly pointing along the charge-pole direction.
When we integrate all the contributions, we find that
the total angular momentum is non-zero and in fact exactly
equal to the quantized product of e and g values in
the appropriate units. A static system with a non-zero total
angular momentum, a value that is quantized in half
integral units and does not depend on the distance between
the two sources, is remarkable indeed. We will return
to these properties in Volumes II and III where we discuss
the spin and statistics properties of particles in two
dimensions.
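The integration over all contributions can be sketched numerically. The following is an illustration under stated assumptions, not the calculation as set in the original exam problem. It assumes Heaviside-Lorentz-type conventions matching eg = 2πnħc, with E = e r/(4πr³) for a charge at the origin, B = g(r − d ẑ)/(4π|r − d ẑ|³) for a pole at z = d, and field momentum density E × B/c. With these choices the z-component of the angular momentum reduces to J = (eg/8πc) × I, where I is the dimensionless double integral evaluated below. The function name and grid sizes are arbitrary.

```python
import math

def angular_momentum_integral(d, n_t=300, n_theta=300):
    """Midpoint-rule estimate of  I = d * Int_0^inf dr Int_0^pi dtheta
    r sin^3(theta) / s^3,  with s^2 = r^2 - 2 r d cos(theta) + d^2.
    The radial variable is mapped to t in (0, 1) via r = d t / (1 - t)."""
    total = 0.0
    dt = 1.0 / n_t
    dth = math.pi / n_theta
    for i in range(n_t):
        t = (i + 0.5) * dt
        r = d * t / (1.0 - t)
        dr_dt = d / (1.0 - t) ** 2
        for j in range(n_theta):
            th = (j + 0.5) * dth
            s2 = r * r - 2.0 * r * d * math.cos(th) + d * d
            total += d * r * math.sin(th) ** 3 / s2 ** 1.5 * dr_dt * dt * dth
    return total

# The same value for every separation d between charge and pole.
values = {d: angular_momentum_integral(d) for d in (0.5, 1.0, 4.0)}
for d, value in values.items():
    print("d =", d, " I =", value, " J in units of e g / (4 pi c):", value / 2)
```

The integral comes out close to 2 for every separation, so that J = eg/4πc under these assumptions. Inserting the Dirac condition eg = 2πnħc then gives J = nħ/2, the half-integral values mentioned above.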


Statistical Physics: 
from micro to macro physics 


This section is about macroscopic systems consisting of 
very large numbers of atoms or molecules and focusses 
on the link between microscopic and macroscopic behav- 
ior, between individual and collective (equilibrium) degrees 
of freedom. The physics of macroscopic phenomena ev- 
idently started as a phenomenologically driven discipline, 
and it followed the Newtonian approach, by applying an- 
alytical methods using differential equations to describe 
continuous media like gases, liquids and to some extent 
solids. It led to a rich variety of equations for thermo-,
hydro- and aerodynamics. A crucial turning point came
with the acceptance of the molecular hypothesis, the real- 
ization that all forms of matter are made up of tiny mol- 
ecules. This posed a new challenge, namely to derive 
and explain all the known macroscopic physics starting 
from applying basic Newtonian mechanics on the molec- 
ular level. As one is not interested in the detailed behav- 
ior of the individual atoms, statistics serve as a powerful 
bridge between the incoherent individual dynamics and the 
often perfectly coherent dynamics of the collective. This 
led to a fundamental branch of theoretical physics called 


statistical mechanics, which is considered the third great 
achievement of classical physics. This approach allowed 
us to understand numerous so-called emergent phenom- 
ena - the properties of the collective that are not present on 
the level of the individual atoms. In this section we focus 
on thermodynamics: we will first give its macroscopic def- 
inition and its three basic laws, and then we will show how 
a statistical physics approach enables a deeper and unified
understanding of the subject. The reason why we are
focusing on thermodynamics is that it was within that field 
that the all-important concept of entropy as a measure of 
disorder and information originated. 


Thermodynamics: the three laws 


Thermodynamics is a general theory that started with the 
noble aim to systematically improve the performance of 
steam engines and the like, but has now also found no- 
table applications for less down-to-earth systems like black 
holes. A thermodynamical system — think of a fridge or a 
steam engine, or just an amount of gas in a container — 
can do work and exchange heat or energy with other systems
or its environment. Thermodynamics studies the relations 
between heat, energy and the ability of the system to do 
work. 


Thermodynamics is a macroscopic theory; nowhere does 
it refer to the specific microscopic structure of the system. 
However, when we introduce the subject, it is easiest to en- 
visage a simple gas in equilibrium in a container with par- 
ticular values for the macroscopic state variables, pressure 
P , volume V , temperature T as depicted in Figure 1.1.32. 
The fourth state variable, the entropy S, is more hidden as 
it provides a link between temperature and heat as we will 
see. This system has an internal energy U(T) which is the 
total energy of its internal degrees of freedom. 


The essentials of thermodynamics are expressed in three 
Figure 1.1.32: Gas in thermal equilibrium. Gas in a container
with a movable piston kept at a given temperature T and pres- 
sure P, yielding a certain volume V. The three state variables 
are not independent but satisfy an equation of state. For an ideal 
gas that relation is given in the next figure. 


famous laws. In fact there is a fourth law, which is usually 
referred to as the zeroth law of thermodynamics, presum- 
ably because it is considered to be self-evident. 


The zeroth law introduces the notion of thermodynamical 
equilibrium, and stipulates that it is a transitive property, 
that is to say that if system A is in equilibrium with B, and 
A is also in equilibrium with C, then B and C are also in 
equilibrium. This allows you to define the thermodynamical 
(absolute) temperature of a system. 


The first law is basically the statement that energy is con- 
served. This is expressed in a relation stating that adding 
some heat dQ to the system will result in an increase of the 
internal energy dU and the ability for the system to do me- 
chanical work, which for the gas in the container equals the 
pressure times the change in volume dW = PdV: 


$$ dU = dQ - P\,dV. \tag{1.1.58} $$


Figure 1.1.33: Ideal gas law. This graph shows the ideal gas
law PV = kNT, expressing the dependence between the thermodynamical
variables P, V and T, with k the Boltzmann constant
and N the number of particles (equal to Avogadro’s
number for one mole of gas).
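As a small worked example of this equation of state, the sketch below evaluates the pressure of one mole of an ideal gas at 0 °C in a volume of 22.414 litres. The function name is illustrative. The values of the constants are the exact SI values.

```python
K_B = 1.380649e-23      # Boltzmann constant in J/K (exact in the SI since 2019)
N_A = 6.02214076e23     # Avogadro's number (exact in the SI since 2019)

def pressure(n_particles, temperature, volume):
    """Ideal-gas pressure in Pa from P V = N k T (T in kelvin, V in m^3)."""
    return n_particles * K_B * temperature / volume

p = pressure(N_A, 273.15, 0.022414)        # one mole at 0 °C in 22.414 litres
p_double = pressure(N_A, 273.15, 0.044828) # same gas in twice the volume

print("pressure in Pa:", p)
print("pressure at doubled volume in Pa:", p_double)
```

The result is very close to one standard atmosphere (101325 Pa), and doubling the volume at fixed temperature halves the pressure, which is the hyperbolic dependence shown in the graph.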


The second law is the most famous: it features the notion 
of entropy, denoted by S, which is defined by the following 
relation between heat and temperature: 

dQ = TdS. (1.1.59) 
This fundamental state variable of any thermodynamical 
system was introduced by Rudolf Clausius around 1850, 
as was the second law. The law states that for a closed 
system (say a fixed quantity of gas in a thermally isolated 
vessel) entropy can never decrease in time: 


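$$ \frac{dS}{dt} \geq 0. \tag{1.1.60} $$

The third law states that the entropy of a system tends to
zero as its temperature approaches absolute zero:

$$ \lim_{T \to 0} S = 0. \tag{1.1.61} $$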

More precisely it goes to a constant which measures the 
ground-state degeneracy of the system. 


Entropy is a sort of measure for disorder: the second law boils
down to the familiar phenomenon that (closed) systems


Figure 1.1.34: Ludwig Boltzmann's epitaph. The expression for 
the entropy S of a macroscopic state in terms of the number W 
of microscopic states corresponding to it appears as epitaph on 
Boltzmann’s tombstone in Vienna’s Zentralfriedhof, where he
was buried in 1906. (Source: Wikimedia.) 


have a natural tendency to become maximally messy or
mixed. This applies to teenagers’ rooms as well as to tea particles
in a pot filled with hot water. As the entropy reaches its
maximum value the system reaches an equilibrium state.
So, if we put a droplet of ink in a bowl of water, and neither
change the amount of water, nor the temperature, nor
the volume, then we still see the distribution of the ink molecules
through the water changing. In this process the
entropy increases until the ink is completely mixed and
distributed uniformly and equilibrium is reached. So increasing
entropy, you could say, is linked to this process of
increasing ‘disorder’. To get a deeper understanding of the
entropy concept it is necessary to include the microscopic
structure of the system whatever that may be. This is our
next topic.


Understanding entropy. 


The fundamental expression for the entropy S of a given 
macroscopic state was derived by the great Austrian physi- 
cist Ludwig Boltzmann, who stated that it is proportional to 
the logarithm of the number of microscopic states W cor- 
responding to the macroscopic state under consideration. 
So 


$$ S = k \log W, \tag{1.1.62} $$


where k is, not surprisingly, called the Boltzmann constant. Now log W is a pure number and therefore k has the units of entropy, joule/kelvin. This famous expression was the precursor of the general notion of the information capacity of a system as the logarithm of the number of available states, as it was defined by Claude Shannon in his 1948 foundational paper on information theory. Shortly we will generalize the formula so as to establish an explicit connection between statistics and entropy. This relation between entropy and information theory will also be taken up again in the section The physics of information in Chapter 1.2.


Context dependence of the entropy. To illustrate some features of the entropy concept, we start with some examples of pure configurational entropy. Take a system of N boys and N girls that can be located in any of 2N positions. Assume furthermore that the macroscopic observer is pretty much blind and has no possibility of distinguishing between boys and girls, nor of telling how many people sit at a given position. So there is no constraint on the configurations and there is only a single macro state. In this case the question is to count the number of possible configurations of 2N people on 2N positions. Now we have to specify the conditions that the micro states have to satisfy. If the people were distinguishable (have names) then the number of possible (micro) states would be W_1 = (2N)^{2N}, as any person can be in any of 2N positions. If we assume they are indistinguishable, then we count a micro state where two people are interchanged as the same state; for 2N people we have (2N)! different orderings that count as one, and the number of configurations is therefore reduced by this number: W_2 = (2N)^{2N}/(2N)!. If, still on the microscopic level, we add the distinction between boys and girls, we can exchange people of the same gender only, and we have to replace the (2N)! with the much smaller number N! N!, yielding W_3 = (2N)^{2N}/(N!)^2. Next we may add the constraint of exclusion, meaning that only one person per position is allowed (they behave like fermions), which for the system at hand means that all positions are taken. With name identification the number of configurations is equal to the number of permutations of 2N names, given by (2N)!. With gender identification only, we identify the N! permutations of boys and of girls separately, yielding W_4 = (2N)!/(N! N!).
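To get a feeling for these four numbers, here is a minimal Python sketch that evaluates ln W (the entropy in units of k) for each counting scheme. The function name `log_counts` and the chosen values of N are illustrative assumptions, not part of the text. Note also that dividing by (2N)! or (N!)^2 is an approximate way of counting, so W_2 and W_3 need not be whole numbers; that is why the sketch works with logarithms.

```python
import math

def log_counts(N):
    """Return ln W for the four counting schemes of 2N people on 2N positions."""
    M = 2 * N
    ln_fact = lambda m: math.lgamma(m + 1)   # ln m!
    ln_W1 = M * math.log(M)                  # named people, any occupancy
    ln_W2 = ln_W1 - ln_fact(M)               # nobody distinguishable
    ln_W3 = ln_W1 - 2 * ln_fact(N)           # only gender distinguishable
    ln_W4 = ln_fact(M) - 2 * ln_fact(N)      # gender only, one person per position
    return ln_W1, ln_W2, ln_W3, ln_W4

for N in (5, 10, 50):
    print(N, [round(x, 2) for x in log_counts(N)])
```

More resolution raises ln W, while the exclusion constraint lowers it, which is the pattern the text goes on to discuss.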


The effect of resolution and/or constraints. What this lit- 
tle exercise conveys is that the definition of entropy is very 
much context dependent. Firstly there is the microscopic 
context of what the degrees of freedom are that one wants to
take into account and what the microscopic restrictions are 
(like distinguishability, exclusion etc.), and secondly there 
is the macroscopic context determined by what the macro- 
scopic observer is able to distinguish, resolve, or measure 
(names, gender, spatial compartments etc). So, in general 
the system has two levels and the entropy is a quantity that 
basically relates the resolutions (the set of observables 
and the precision with which these can be probed) and 
the constraints that determine which states are accessible 
at each level, and how these observables at the two levels 
are related. Again, in the examples given above, (i) there 
was only a single macroscopic state for any given N, and 
(ii) on the micro level we saw that more resolution leads to 
more states, while more constraints lead to fewer acces- 
sible states. In that sense within a given closed system, 
indeed, eliminating a constraint leads to ‘more disorder’ 
and also a larger number of accessible micro states and 
thus to an increase of the entropy. In the sequence above
we have W_1 > W_2 (less resolution), W_2 < W_3 (more resolution),
and W_3 > W_4 (adding a constraint).


The common statement that ‘higher entropy means more 
disorder’ is actually quite subtle, and to get a better un- 
derstanding of this question we add one further structural 
element to the above example. 


Maximal entropy. We consider the previous system with 
2N positions, but take there to be two compartments, with 
N positions each separated by a gate, and in each position 
sits one person. The basic interaction is one where two 
people exchange position. We start with a special (histor- 
ically determined) initial state or configuration with all the 
red-haired girls on the left and the blue-eyed boys on the 
right. The boys and girls have no names, so exchanging 
two boys and/or two girls does not change configuration. 
This means that the initial strict gender separated configu- 
ration is a unique one: there is only one such state and it 
has minimal entropy S = k ln 1 = 0. Next we open the gate
in the middle and boys and girls start mixing. I am vaguely
suggesting that we are talking about a college dormitory 
complex in the 1950s, say with N = 10%. The level of 
frustration among students about the gender separation is 
like a temperature, and when that becomes high enough, 
the youngsters start jumping the fences everywhere to go 
coed. Nice analogy, but now you ask me why this college
only admits blue-eyed boys and red-haired girls. I
haven’t thought about a suitable interpretation for this but
no doubt there is one. Physicists, I fear, prefer to think
of an ideal gas consisting of equal numbers of red and blue
atoms where N = 10^{23}. Let us now increase the resolution
of the macro observer and assume that they can some- 
how measure the number n of boys/girls that are in the 
‘other’ compartment, so the macro-states are labeled by 
n. Now we ask how many microscopic possibilities there 
are to realize that particular macro state. The question 
is to distribute 2N youngsters over two partitions. Let us start with one boy/girl pair jumping the fence: the boy and the girl can each come from any of N positions, so for the state with n = 1 we have N × N = N^2 possible configurations (or micro-states). In the second cross-barrier move, the boy/girl has only N − 1 positions to come from or go to, which means there are [N(N − 1)]^2 possible n = 2 configurations, but now we over-count configurations: we should not have counted the gender neutral exchange of two boys or of two girls as different, so we still have to divide by a factor 4. What we see is that the number of configurations increases extremely rapidly as a function of n. The general answer is not too hard to understand from the previous examples:

W_n = \binom{N}{n}^2 ,   (1.1.63)

where the notation \binom{N}{n} stands for the binomial coefficient N over n. Note that this function is symmetric under interchange of n and (N − n). Furthermore we observed that the function increases for growing n; together these observations imply that the maximum of W_n is achieved for n = N − n = N/2. Thus, if all micro-configurations are equally probable, the macro-state with n = N/2 has the largest number of possible micro-states and therefore the largest entropy:

S_{\max} = k \ln W_{N/2} = 2k \ln \binom{N}{N/2} .

This means that if we let the random dynamics run for a sufficiently long time from any initial macro-state, and we then probe the system, we will almost certainly find a configuration with n (very nearly) equal to N/2. So the remarkable insight we gain is that a random process drives the system to a particular macroscopic state, namely the state that is the most probable because it has the most microscopically distinct realizations, which is the state with the highest entropy. And this is the second law of thermodynamics at work. A system has the natural tendency to move from a less to a more probable macro state. That state is the maximally mixed and therefore maximally disordered state that is admissible.

Figure 1.1.35: Gender mixing. The initial state is the one with all red-haired girls in the left and all blue-eyed boys on the right. When we let them interact through some random girl/boy exchange mechanism, the entropy will increase; equilibrium is reached when the left and right colors have become equal.
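The counting argument can be checked directly for a modest N. The sketch below assumes the formula W_n = C(N, n)^2 for the number of micro-states with n boys on the girls' side; the helper names `W` and `entropy_over_k` and the choice N = 20 are mine, purely for illustration.

```python
import math

def W(N, n):
    """Number of micro-states with n boys on the girls' side (and vice versa)."""
    return math.comb(N, n) ** 2

def entropy_over_k(N, n):
    """S/k = ln W_n for the macro-state labeled by n."""
    return math.log(W(N, n))

N = 20
counts = [W(N, n) for n in range(N + 1)]
n_max = max(range(N + 1), key=lambda n: counts[n])
print("W_0, W_1, W_2 =", counts[:3])
print("most probable n =", n_max, " S_max/k =", round(entropy_over_k(N, n_max), 3))
print("fraction of micro-states with n between 7 and 13:",
      round(sum(counts[7:14]) / sum(counts), 4))
```

Adding up W_n over all n gives (2N)!/(N! N!), the number W_4 found earlier for gender identification with exclusion, so the macro-states labeled by n simply carve up that one set of micro-states.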


This process is schematically illustrated in Figure 1.1.35. To give you a feeling for the numbers involved we have listed the binomial coefficients for various modest values of N and n in Table 1.1.1. If N is large, we may approximate the logarithm of a factorial using the famous Stirling formula, which says that, to leading order, ln N! ≈ N ln N. With this formula one can show that S in the above equation is well approximated by S_max ≈ 2kN ln 2, and this in turn implies that in equilibrium the number of micro-states is a number with roughly 0.6N digits. This implies that the probability for finding the completely gender-separated initial macro-state when the system is in equilibrium is of the order of p_0 = 10^{−0.6N}. If you take into account that N is of the order of Avogadro’s constant, ~ 10^{23}, p_0 is extremely small indeed.
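How good is the leading-order Stirling estimate? A rough numerical check, as a sketch: it compares the exact S_max/k = 2 ln C(N, N/2) with 2N ln 2 and counts the decimal digits of the number of micro-states. The function names are illustrative, and the log-gamma function is used only to avoid handling astronomically large integers.

```python
import math

def s_max_exact(N):
    """Exact S_max/k = 2 ln C(N, N/2), via log-gamma to avoid huge integers."""
    return 2 * (math.lgamma(N + 1) - 2 * math.lgamma(N / 2 + 1))

def s_max_stirling(N):
    """Leading-order Stirling estimate S_max/k = 2 N ln 2."""
    return 2 * N * math.log(2)

for N in (10, 1000, 10**6):
    exact, approx = s_max_exact(N), s_max_stirling(N)
    digits = exact / math.log(10)        # decimal digits of W at n = N/2
    print(N, round(exact / approx, 5), round(digits / N, 4))
```

The ratio creeps up to one as N grows, and the number of digits per particle settles near 0.6, in line with the estimate p_0 of order 10^{−0.6N}.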


Table 1.1.1: The binomial coefficients. We have listed some values of the binomial coefficients \binom{N}{n} to demonstrate the steep increase as a function of n on the left, and the maximum value of the distribution as a function of N on the right.

The arrow of time. There is something rather profound going on in this red-blue dynamics. If you look on the micro-scale all the interchanges are equally probable. In fact any move is its own inverse, and therefore the micro dynamics is invariant under time reversal. If you were to reverse the time direction you wouldn’t see the difference in individual moves. On the microscopic level time has no direction! Interestingly, if you look at the macroscopic behavior it clearly has a time direction (namely the one defined by the increasing entropy), and that is not an abstract something or other: it is directly observable. From a macroscopic point of view, when N is large, you see the red compartment slowly turning blueish and the blue compartment slowly turning reddish, but the process stops at a point where both halves have acquired the same purple color. So, somehow the system has created its own arrow of time: whatever macro-state you start with, it will always move towards the uniform purple color distribution with maximal entropy.
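The emergence of this arrow can be watched in a toy simulation. The sketch below is one possible reading of the ‘random exchange’ dynamics, not a prescription from the text: at every step a randomly chosen person on the left swaps places with a randomly chosen person on the right. The function `mix`, the value N = 500, and the fixed seed are assumptions made for illustration.

```python
import random

def mix(N, steps, seed=0):
    """Random exchange dynamics; returns the history of n = boys on the left."""
    rng = random.Random(seed)
    left = ["girl"] * N          # initial state: fully separated
    right = ["boy"] * N
    n, history = 0, [0]
    for _ in range(steps):
        i, j = rng.randrange(N), rng.randrange(N)
        # n changes by +1, 0 or -1 depending on who crosses the gate
        n += (right[j] == "boy") - (left[i] == "boy")
        left[i], right[j] = right[j], left[i]   # each swap is its own inverse
        history.append(n)
    return history

N = 500
history = mix(N, steps=20 * N)
for t in (0, N // 4, N, 5 * N, 20 * N):
    print(t, history[t] / N)     # fraction of boys on the left
```

Every single swap is reversible, yet the printed fraction climbs from 0 towards 1/2 and then merely fluctuates around it.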


This ‘coarse graining’ mechanism (see Figure 1.1.36) lies at the basis of the time arrow in the real world as well, because, as we have shown in the previous sections, both Newton’s and Maxwell’s equations are time reversal invariant if the interactions are. What this means is that given a solution to the equations, turning the time around, meaning that we replace t by −t, also produces a solution (though it may be a different one). On a microscopic level playing the film backward would show another, equally acceptable sequence of events, but on a macroscopic level this is not true. If I drop my bowl of yogurt, fruit and granola on the floor, then showing that sequence of events in reverse order may be hilarious, but it is certainly not of this world. Indeed, this elementary example teaches us that the direction of time is emergent, since it has everything to do with the relative number of micro-states belonging to a given macro-state. A randomly propagating system tends to move from a less to a more probable state, and reaches equilibrium in the most probable, maximal entropy state.

Figure 1.1.36: Coarse graining a portrait. We average the color content over larger and larger (overlapping) squares. In the ‘blurring’ process the image loses resolution and is therefore hiding ever more information content. The entropy is increasing and the process is irreversible. The entropy is a measure for the amount of micro level information that is ‘hidden’ from the macroscopic observer.
Two cultures. The second law of thermodynamics paradoxically owes part of its fame to the fact that it is so little known. This was poignantly pointed out by the author (and physicist) C.P. Snow in his provocative essay entitled Two cultures, published in the New Statesman in 1956, in which he bitterly complained about the scientific illiteracy of the cultural elite, and where he used the manifest ignorance about the second law of thermodynamics (which in his opinion had a cultural importance comparable to the works of Shakespeare) as a criterion to underpin his criticism. Let me say that Snow’s intervention on behalf of thermodynamics did not turn Boltzmann into a Shakespeare. Some years later, however, it did at least provoke a strongly worded reaction from the literary critic F.R. Leavis, making the mutual incomprehension even more acute. In a remarkable piece of word craft Leavis stated: ‘Snow doesn’t know what he means, and doesn’t know he doesn’t know.’ ‘The intellectual nullity,’ he added, ‘is what constitutes any difficulty there may be in dealing with Snow’s panoptic pseudo-cogencies, his parade of a thesis: a mind to be argued with - that is not there; what we have is something other.’ ‘But what else to expect from a crappy writer like Snow?’ ‘As a novelist,’ wrote Leavis, ‘he doesn’t exist; he doesn’t begin to exist. He can’t be said to know what a novel is.’

The sad point about the situation described by Snow is that it has barely changed over the past half century. So don’t ask friends to recite the second law in public: your popularity will most probably plummet instantly.


This being said, we should be cautious: in any given system there will be fluctuations where the entropy actually decreases. The micro-dynamics do not preclude such moves, but on average the entropy does not decrease.


It is an awesome idea but certainly correct that in the sys- 
tem we just studied, there is a non-zero albeit inconceiv- 
ably small probability for the system to pass through the 
same initial state again! 


But that was a state with a lower entropy! The existence 
of such a recurrence time was proven by Henri Poincaré in 
1890. A rough estimate for this recurrence time will be of 
the order t ~ 10N = 10!°”* sec, which is of the order of 
10% times the age of the universe. In whatever units you 
like to express this truly dazzling number, it is evident that 
this recurrence is not an event to just sit and wait for!


This amusingly may remind you of the problems that peo- 
ple who have no understanding of statistics and probability 
encounter. Events like the spontaneous gender separation
under the given random dynamics in our example are
logically not excluded, but they would take forever! Assigning
outrageously large probabilities to events which are log- 
ically not excluded but highly improbable is a specialty of 
so-called conspiracy theorists. Indeed, it would take a con- 
spiracy of extreme proportions to realize such super im- 
probable events, like having all air molecules accumulate 
in one tiny corner of the room, and you dying because of a 
lack of oxygen. 


Statistical mechanics 


The molecular hypothesis. A major step forward was the acceptance of the molecular hypothesis, implying that all matter is ultimately built up from microscopic, molecular or atomic constituents. One of the strongest protagonists of this hypothesis was Ludwig Boltzmann. For the number of particles in such macroscopic systems the scale is set by the constant of Avogadro, of the order of 6 × 10^{23}, the number of atoms in a mole of some gas,⁸ a number that makes even strong people quiver. This molecular perspective raised the fundamental challenge for physicists to establish an explicit connection between microscopic physics (mechanics and electromagnetism) and the aforementioned macroscopic laws. The molecules obey the classical laws, and one - pretty naive - way to think about addressing this challenge would be to face the problem head on and try to solve ~ 10^{23} coupled Newtonian equations for the individual particles simultaneously. Hmm, apart from the computational power needed, this doesn’t sound like a very smart idea, does it? Particularly since we are not at all interested in the precise behavior of every individual particle.


Statistical approach. A successful approach is the sta- 
tistical one, where one links the macroscopic properties 
like pressure, temperature and entropy to certain average 
properties of the collective of molecules. Indeed, with such 
huge numbers statistical methods become extremely pow- 
erful and precise as any insurance company can tell you. 
What properties of the molecular collective could be mean- 
ingfully lifted to relevant variables at the macroscopic level? 
These would typically be the conserved quantities like en- 
ergy, momentum and particle number. The energy is con- 
served and for a closed system would be just an additive 
quantity: the energy of the macroscopic system is just the 
sum of the individual particle energies and their interac- 
tions. The total energy is rigorously conserved: in other 
words, constant. 


Open and closed systems. One is not limited to closed systems, and one might also consider an open system that is coupled to an energy reservoir kept at a fixed temperature (also called a ‘heat bath’), which means that one allows for energy (heat) flows between the system and the reservoir as we depicted in Figure 1.1.32. If we raise the temperature of the reservoir, heat will flow to the system, raising the internal energy and allowing it to do a certain amount of work. And this gives you an idea of how the first law of thermodynamics can be derived from the microscopic laws. In other words, temperature is the external parameter that sets the average energy of the system, and in that sense imposes an external constraint on the system. For the particle number an analogous reasoning holds. Here one may couple the system to a particle reservoir which is kept at a fixed chemical potential μ. This potential corresponds to the energy it costs to add one more particle to the system. These considerations can be made very precise and are part of the field of statistical mechanics, developed by physicists like Boltzmann, Maxwell and Gibbs.

⁸ As explained in Chapter 1.3, a new definition as of May 20, 2019, of Avogadro’s number or constant sets it exactly equal to N_A = 6.02214076 × 10^{23}.


Equipartition of energy. One can show that for a system in 
equilibrium, on average, the energy is equally partitioned 
over the individual particles, which means that the notion 
of temperature is linked to the average energy per particle 
in the system. In fact the correct way to say this is that the energy is equally distributed over the degrees of freedom, where for a system in equilibrium at temperature T, each degree of freedom gets an energy ⟨ε_i⟩ = kT/2. A particle in three dimensions has three independent velocity components and therefore three degrees of freedom. Consequently, for a system of N particles the average energy will be ⟨E⟩ = 3NkT/2.
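To put numbers to these formulas, here is a small worked example in Python. It assumes room temperature, T = 300 K, and one mole of a monatomic gas; both choices and the function names are illustrative. The constants are the exact SI values of the Boltzmann and Avogadro constants.

```python
K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
N_A = 6.02214076e23     # Avogadro constant in 1/mol (exact SI value)

def energy_per_degree_of_freedom(T):
    """Equipartition: kT/2 per degree of freedom, in joule."""
    return 0.5 * K_B * T

def mean_energy(n_particles, T, dof_per_particle=3):
    """Average energy 3NkT/2 for N particles with three degrees of freedom each."""
    return dof_per_particle * n_particles * energy_per_degree_of_freedom(T)

T = 300.0
print(energy_per_degree_of_freedom(T))   # joule per degree of freedom
print(mean_energy(N_A, T))               # joule for one mole
```

So each degree of freedom carries only about 2 × 10^{-21} J, while a mole of gas stores a few kilojoules.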


Phase space representations of a multi-particle sys- 
tem. Imagine we have a gas that consists of N identical 
particles in a volume V, then there are two distinct phase 
space representations of the system possible. One is rele- 
vant if one wants to study the average single particle prop- 
erties or (auto)correlations and refers to the one-particle 
phase space, while the other concerns the distribution over 
different multi-particle micro-states that correspond to a 
single macro-state. 
γ-space. Let us start with the one-particle phase space γ = (x, p), and represent the state of each particle in the system as a point. This yields a certain density of points, corresponding to a distribution f(γ, t).⁹ If the system is in equilibrium, then we expect: (i) the particles to be uniformly distributed in ordinary x-space, (ii) the distribution to be time independent, and (iii) the momentum dependence to be isotropic. This tells us that in equilibrium f(x, p, t) → f(|p|), which gives rise to the famous Maxwell-Boltzmann distribution, which is a Gaussian distribution in p-space with the exponent equal to minus the energy in units of kT: −ε/kT = −p^2/2mkT. Why the exponential energy suppression factor? There are two elementary requirements which make this plausible. If, for a simple system like an ideal gas where the particles do not interact and are independent, we look at two particles, then the joint probability to find one of them with p_1 and the other with p_2 would just be the product of the one-particle probabilities: f_2(p_1, p_2) = f(p_1) f(p_2). At the same time, the two-particle configuration should be weighted by the total energy, which is the sum of the two energies. This should hold for any partitioning into non-interacting components, which means that the exponential factor is the unique answer, because by multiplying two exponentials the exponents add.
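Both properties of the Maxwell-Boltzmann distribution can be tested with a quick Monte Carlo sketch. It assumes units in which m = 1 and kT = 2, a sample of 100,000 particles and a fixed random seed; all of these, like the function names, are arbitrary illustrative choices. Each momentum component is drawn from a Gaussian of variance m kT.

```python
import math
import random

def sample_momenta(n_particles, m=1.0, kT=1.0, seed=1):
    """Draw 3-component momenta; each component is Gaussian with variance m*kT."""
    rng = random.Random(seed)
    sigma = math.sqrt(m * kT)
    return [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(n_particles)]

def mean_energy_per_component(momenta, m=1.0):
    """Average kinetic energy p_a^2 / 2m for each of the three components a."""
    n = len(momenta)
    return [sum(p[a] ** 2 for p in momenta) / (2.0 * m * n) for a in range(3)]

def energy(p, m=1.0):
    """Kinetic energy p^2 / 2m of a single particle."""
    return sum(c * c for c in p) / (2.0 * m)

def weight(p, m=1.0, kT=1.0):
    """Unnormalized Boltzmann weight exp(-energy / kT) of a single momentum."""
    return math.exp(-energy(p, m) / kT)

momenta = sample_momenta(100_000, m=1.0, kT=2.0)
per_component = mean_energy_per_component(momenta, m=1.0)
print(per_component, sum(per_component))      # each near kT/2, total near 3kT/2

p1, p2 = momenta[0], momenta[1]
joint = math.exp(-(energy(p1) + energy(p2)) / 2.0)       # weight of the pair
print(joint, weight(p1, kT=2.0) * weight(p2, kT=2.0))    # exponents add
```

The three components come out statistically equal (isotropy), their sum reproduces the equipartition value 3kT/2, and the weight of the pair factorizes into the product of the one-particle weights.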


⁹ I refer readers who are not familiar with the basics of probability theory to the Math Excursion ‘On probability and statistics’ on page 626 of Part III.

Γ-space. We can also define the phase space for the whole system; that total phase space is defined as the Cartesian product of the N individual spaces. This multi-particle phase space Γ_N = {(x)^N, (p)^N}, of N coordinate and N momentum vectors, is 6N-dimensional, as each particle has three position and three momentum components. This is a very high-dimensional space, and at any given instant the system as a whole is represented by a single point in that space. The particles will bounce around, which means that the point representing the system will move around in that space, and to study the macroscopic properties of the system we would have to consider long-time averages of those properties. Clearly variables defining macro-states, like for example the total energy, define a constraint on the micro-states, which means that these variables will define certain subspaces or strata in Γ. The micro-states in such a domain can be quite different but cannot be macroscopically distinguished.


Ergodicity. A basic assumption of statistical mechanics, 
called the ergodic principle, is that we can replace the time 
averages of the system with Γ-space averages using the
appropriate distribution representing the equilibrium micro- 
states. The principle is supposed to hold in the thermody- 
namic limit, where time, volume and the particle number 
go to infinity (keeping n = N/V fixed). 


In this setting one may associate with a single equilibrium state of the macro-system a stationary distribution of points, corresponding to the probability for the different micro-states representing that macro-state to occur. One introduces a weight function ρ(Γ) which may depend on external parameters like temperature or particle number that represent the macroscopic conditions one imposes. Now ρ(Γ) defines what is called a statistical ensemble of micro-states. If the system is closed (fixed total energy), we speak of the micro-canonical ensemble. If we couple it to a thermal bath, we have the canonical ensemble with weight function ρ(Γ) = e^{−H(Γ)/kT}, where H(Γ) is the energy function (Hamiltonian) for the multi-particle system. If we also let the number of particles N vary, we get the grand canonical ensemble. It was the American physicist Josiah Willard Gibbs who introduced the notion of an ‘ensemble’ of micro-systems, and the ‘ensemble distributions’, to calculate the desired averages in all types of macro-states.


Maybe, to illustrate these rather abstract notions, it helps to extend our red-eyed/blue-haired, excuse me, red-haired/blue-eyed youngster model to include variables like ‘money’ and ‘group size’. Clearly the group size is just the number N we introduced before, and we could make it a variable by coupling to a reservoir of similar pairs who are allowed to join. The amount of money would be the social equivalent of energy, and in a closed system money would be conserved: people could exchange money as long as the total amount of money stays the same.


If you don’t like the analogy, you certainly have a point: whereas in the world of particles there is such a wonderful thing as the equipartition of energy, that is to say that on average every particle has an equal energy, the same does not seem to hold in the world of money. It’s quite the opposite: we witness a process of wealth accumulation. This is a non-equilibrium situation which tends to result in a macabre final state where presumably one person owns all the money. In this case one could speak of the capitalist singularity, whereas for the particles one ends up with a socialist uniformity. In this analogy the thermal bath would be represented by the central banks, who can raise the fiscal ‘temperature’ by printing money. I invite the ambitious reader to think about how to include taxation in the model. What these analogies try to convey is that for all these systems there is a notion of a phase space, of external parameters, and of a statistical ensemble that describes the probability distribution of micro-states depending on those external parameters.


The partition function. The partition function of a many-body system is now defined as a phase-space integral, Z = ∫ ρ(Γ) dΓ. You could say that the partition function gives the ‘volume’ of the domain in Γ-space corresponding to the external (macro) parameter choices made in ρ. For example, with ρ describing the canonical ensemble, for a system in contact with a heat bath kept at a temperature T, the partition sum would depend on T as an external parameter.


Emergence. Let us also point out another interesting feature of this statistical approach to systems consisting of many degrees of freedom (particles). In many ways this perspective allows one to introduce ‘mean fields’ as an approximation to the many body system that underlies it. One passes from a corpuscular perspective to a continuous one. From the macroscopic point of view, a water flow in a river would be described by a mass density field ρ(x, t), a velocity field v(x, t), and an energy density or temperature field T(x, t). These continuous fields are defined by smearing out the local average properties of many particles. You may say that this assumes the existence of a local equilibrium in the system. One may show that these local fields have to obey certain specific dynamical field equations called the laws of hydro-, aero- or plasma-dynamics. These field equations follow from averaging the continuity equations for the locally conserved quantities of the interacting micro system. The resulting laws are ‘emergent’ and approximately describe many novel, so-called emergent collective properties; in the case of water, you should think of waves and vortices.¹⁰ Water waves are a phenomenon of which the individual water molecules have no idea; the wave property is not present at the constituent level, and it is in that sense that people like to say that the ‘whole is more than the sum of its parts.’ And it is for that reason that water waves are called an ‘emergent’ phenomenon. In the simple red-haired-girls/blue-eyed-boys model, we saw the arrow of time emerging, and the emergent (phenomenological) law was telling us that the two colors would uniformly change to the same color purple.


¹⁰ It is striking to see that Maxwell himself believed that his own electrodynamics was an effective description of the collective behavior of an underlying molecular world. As we will see later on, the quantum theory of fields is in a certain way a vindication of that point of view.


Statistical thermodynamics.


Let us return to thermodynamics. In the statistical approach to a system in thermal equilibrium, say, a fixed quantity of gas in a container that we keep at a fixed temperature T, we think of the macro-states labeled by the thermodynamic state variables P, V, S, N and T. In this situation heat can flow from and to the heat reservoir, which means that in thermal equilibrium the energy of the microscopic system is not constant. It will typically fluctuate around the thermal average U = ⟨E⟩ = 3NkT/2. The relevant energy variable is the (Helmholtz) free energy, which is defined as:


F=U-TS, (1.1.64) 


and should be thought of as a function of T and V, because 
it follows from the first law that a change in the free energy 
is given by 


dF = dU - S\,dT - T\,dS = -P\,dV - S\,dT .   (1.1.65)


Note that from its definition, minimizing the free energy 
combines the natural tendencies to minimize the internal 
energy U and maximize the entropy S. 
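For readers who want the intermediate step behind (1.1.65) spelled out, here is a short derivation. It assumes the first law in the form dU = T dS − P dV for a fixed number of particles, which is how it is usually written for a simple gas.

```latex
\begin{align}
dU &= T\,dS - P\,dV , \\
dF &= d(U - TS) = dU - T\,dS - S\,dT = -P\,dV - S\,dT , \\
P &= -\left(\frac{\partial F}{\partial V}\right)_{T} , \qquad
S = -\left(\frac{\partial F}{\partial T}\right)_{V} .
\end{align}
```

So once F is known as a function of T and V, the pressure and the entropy follow by differentiation.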


Let us consider a simple discrete model where each macro- 
state corresponds to a well-defined set of different configu- 
rations on the microscopical level called micro-states. This 
example aims to illustrate how the link between micro- and 
macro-physics is established. These micro-states are labeled by an index ‘i’ and each have a certain energy E_i. The probability p_i that a micro-state occurs is again proportional to the Boltzmann weight w_i = exp(−E_i/kT), which says that the high-energy states are exponentially suppressed.


We may then write that the probability is:

p_i = \frac{e^{-E_i/kT}}{Z} ,   (1.1.66)

where Z is the partition sum defined as

Z = \sum_i e^{-E_i/kT} .   (1.1.67)


Note that the sum of all probabilities indeed equals one. 
The link between the macroscopic and microscopic states 
is established by giving the expression for the free energy 
in terms of the partition sum: 


F =—kT InZ. (1.1.68) 


From this relation the thermodynamical quantities can be 
derived. For example with this link it is possible to calculate 
the famous expression first derived by Gibbs, for the en- 
tropy in terms of the probability distribution. Subsequently 
using equations (1.1.68) and (1.1.66) we obtain 


F = \sum_i p_i F = -kT \sum_i p_i \ln Z
  = -kT \sum_i p_i \left( -\frac{E_i}{kT} - \ln p_i \right) = \sum_i p_i E_i + kT \sum_i p_i \ln p_i .


Given that by definition U = ⟨E⟩ = \sum_i p_i E_i, we find
from (1.1.64) that the entropy can be expressed as


S = -k \sum_i p_i \ln p_i .   (1.1.69)

This is the famous expression for the entropy due to Gibbs 
which was (re)derived by Shannon, and being the formal 
definition of information (entropy), forms the basis for in- 
formation theory. At this point it is important to empha- 
size the remarkable generality of this result, as it assigns 
an entropy or information capacity to any given probability 
distribution or statistical ensemble. 
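The chain from Boltzmann weights to the Gibbs entropy is easy to verify numerically. The sketch below uses a made-up set of four energy levels in arbitrary units (an assumption for illustration only), computes p_i, Z, U, S and F, and compares F = −kT ln Z with U − TS. Entropy is expressed in units of k.

```python
import math

def canonical(energies, kT):
    """Boltzmann probabilities and thermodynamic functions for discrete levels.
    The entropy is returned in units of k."""
    weights = [math.exp(-E / kT) for E in energies]
    Z = sum(weights)                                   # partition sum
    p = [w / Z for w in weights]                       # probabilities p_i
    U = sum(pi * E for pi, E in zip(p, energies))      # average energy
    S_over_k = -sum(pi * math.log(pi) for pi in p)     # Gibbs entropy / k
    F = -kT * math.log(Z)                              # free energy
    return p, Z, U, S_over_k, F

energies = [0.0, 1.0, 1.0, 2.5]      # hypothetical level energies, arbitrary units
for kT in (0.2, 1.0, 5.0):
    p, Z, U, S, F = canonical(energies, kT)
    print(kT, round(U, 4), round(S, 4), round(F, 4), round(U - kT * S, 4))
```

The last two columns agree at every temperature. At low temperature the entropy drops towards zero because only the ground state is populated, and at high temperature it approaches ln 4, the logarithm of the number of levels.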


Note that in equation (1.1.69), for an isolated system with fixed energy (not in contact with a heat bath), the energies E_i become equal, and thus p_i = p = 1/W. Inserting this gives S = −k W (1/W) ln(1/W) = k ln W, which reproduces the Boltzmann result (1.1.62) for the entropy. Assigning equal probabilities is like saying that you have no a priori information about the states, so you are not imposing any constraint, and thus you get the maximum value for the entropy, the one given by Boltzmann. There is a formal, less physics restricted, method for constructing the maximal entropy distribution as defined in equation (1.1.69), which allows for the systematic inclusion of additional constraints or prior knowledge. This is called the maximal entropy principle and is further discussed in the Math Excursion ‘On probability and statistics’ at the end of Part III on page 626.
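That the uniform distribution gives the largest entropy can be probed with a few random trials. This is a sketch, not a proof: the number of states W = 8, the seed, and the helper name `gibbs_entropy` are illustrative assumptions.

```python
import math
import random

def gibbs_entropy(p):
    """S/k = -sum p_i ln p_i, with the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

W = 8
uniform = [1.0 / W] * W
print(gibbs_entropy(uniform), math.log(W))     # Boltzmann value ln W

rng = random.Random(2)
for _ in range(5):
    raw = [rng.random() for _ in range(W)]
    total = sum(raw)
    p = [x / total for x in raw]               # some other normalized distribution
    print(round(gibbs_entropy(p), 4))          # stays below ln W
```

Any extra knowledge that makes some states more likely than others lowers the entropy below the Boltzmann value.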


The energy distribution. To further elaborate on the statistical interpretation of thermodynamics, it is illuminating to look at the energy variable and to derive the energy weight function s(E) from ρ. In the integral over the ensemble of all micro-states, we break the integral up into subsets of equal energy, where states i and j belong to the same subset if E_i = E_j = E. We call n(E) the volume of a thin shell at energy E. This allows us to write the partition function over all micro-states as


Z = [sŒ dE = | nceye em dE 


len +InnlE) ge, (1.1.70) 


It is illuminating to go through this calculation for the simple 
case of an ideal gas, as we will do next. 


The ideal gas.


Let us consider the ideal gas to show how explicit
expressions for the thermodynamical functions in terms of
microphysical variables can be obtained by using statistical
mechanics. We have $N$ particles in a container with volume
$V$ in thermal equilibrium at a temperature $T$. The total
internal energy of a configuration, given by $E$, equals the
sum over one-particle kinetic energies:
$E = \sum_i |\mathbf{p}_i|^2/2m$. To get to the energy
distribution we have to integrate (or sum) the general phase
space distribution $\rho(\Gamma)$ over all $6N$ variables
except the total energy. In an equilibrium state the spatial
distribution is uniform and therefore integrating all the
coordinates gives a factor $V^N$. The integral over the $3N$
momentum components has to satisfy the energy constraint
that the total kinetic energy equals $E$. All
$3N$-dimensional momentum vectors that satisfy this
condition have a length $|\mathbf{p}| = \sqrt{2mE}$. So the
integral yields the area of the $(3N-1)$-dimensional
spherical surface of a $3N$-dimensional ball of radius
$R = \sqrt{2mE}$. This means that the density of states
takes the form:


$$n(E) = C_N V^N E^{3N/2} , \tag{1.1.71}$$


where we have dropped a negligible term equal to 1/2 in
the exponent. The constant $C_N$ is the area of the
$(3N-1)$-dimensional unit hypersphere.¹¹


Figure 1.1.37: Phase space distribution. The rapidly decaying
density of points in phase space (in blue). A fixed energy surface
(in red) is a very high-dimensional spherical surface $S^{3N-1}$.
Adding up the points in a narrow shell yields an extremely
steeply rising function $n(E)$.


At this point we can make the connection with thermody- 
namics, by noting that the entropy of the system is given 
by: 


$$S(E, V) = k \ln n(E) = k \ln C_N + kN \ln\!\left( V E^{3/2} \right) . \tag{1.1.72}$$


Solving this equation for the internal energy $U(S, V) = E$
yields:


$$U(S, V) = V^{-2/3} \exp\!\left[ \frac{2\,(S - k \ln C_N)}{3Nk} \right] . \tag{1.1.73}$$


If we now use the first law in the form $dU = T\,dS - P\,dV$,
we can determine $T$ and $P$:


$$\left( \frac{\partial U}{\partial S} \right)_V = T = \frac{2U}{3Nk} , \tag{1.1.74a}$$

$$-\left( \frac{\partial U}{\partial V} \right)_S = P = \frac{2U}{3V} = \frac{kNT}{V} . \tag{1.1.74b}$$


¹¹ The actual expression, which does not enter our considerations, is:
$C_N = 3N \pi^{3N/2}/(3N/2)!$.


The first equation gives the familiar expression relating the
internal energy to the temperature, $U = \frac{3}{2}NkT$, and
should be read here as a definition of the temperature in
terms of the micro-state energy. From this we may also get
the expression for the specific heat at constant volume,
denoted as $c_v$, which is the energy needed to raise the
temperature by one degree. It is defined as
$(\partial U/\partial T)_V$, which in this case yields
$c_v = 3Nk/2$. The second equation gives the equation of
state for the ideal gas, better known as the ideal gas law
$PV = NkT$. For one mole of gas, where $N$ equals Avogadro's
number $N_A$, this reads $PV = RT$, with the universal gas
constant defined as $R = N_A k$. It is an equation of state
because it relates the three different thermodynamic state
variables $P$, $V$, and $T$. It defines a constrained surface
of allowed thermodynamic states in the space of these three
state variables. This basically concludes our first
principles derivation of some high-school formulae that
apply to the ideal gas.
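The chain from entropy to equation of state can be verified numerically. The sketch below assumes units with $k = 1$ and sets the additive constant $k \ln C_N$ to zero by default; the values of $N$, $V$ and $S$ are hypothetical. It inverts (1.1.72) for $U(S, V)$, takes the two partial derivatives by central finite differences, and checks that the resulting $T$ and $P$ satisfy $U = \frac{3}{2}NkT$ and $PV = NkT$.

```python
import math

K_B = 1.0  # units with k = 1 (assumption of this sketch)


def internal_energy(S, V, N, ln_C=0.0):
    """U(S, V) from inverting S = k ln C_N + k N ln(V U^(3/2))."""
    return V ** (-2.0 / 3.0) * math.exp(2.0 * (S - K_B * ln_C) / (3.0 * N * K_B))


def temperature(S, V, N, h=1e-6):
    """T = (dU/dS) at fixed V, by central finite difference."""
    return (internal_energy(S + h, V, N) - internal_energy(S - h, V, N)) / (2.0 * h)


def pressure(S, V, N, h=1e-6):
    """P = -(dU/dV) at fixed S, by central finite difference."""
    return -(internal_energy(S, V + h, N) - internal_energy(S, V - h, N)) / (2.0 * h)


N, V, S = 100, 2.0, 50.0  # hypothetical state of the gas
U = internal_energy(S, V, N)
T = temperature(S, V, N)
P = pressure(S, V, N)

print("U / (3/2 N k T) =", U / (1.5 * N * K_B * T))
print("P V / (N k T)   =", P * V / (N * K_B * T))
```

Both ratios should come out equal to one up to the finite-difference error. The additive constant in the entropy only rescales $U$ and drops out of both ratios, which is why the footnoted value of $C_N$ never enters.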


It is instructive to reflect a bit more on the overall energy
weight function $s$ of equation (1.1.70). On the one hand, we
know that the density of points in the space drops
exponentially because of the Boltzmann factor. However, the
‘volume’ $n(E)$ of the layers grows extremely fast, like
$E^{3N/2}$, because of the huge value of $N$. The overall
weight, being the product of the two functions, becomes


$$s(E, N) \simeq C_N V^N E^{3N/2} e^{-E/kT} . \tag{1.1.75}$$


To determine the maximum of $s(E, N)$, we set its derivative
equal to zero:


$$\left. \frac{\partial s}{\partial E} \right|_{E_m} = \left( \frac{3N}{2E_m} - \frac{1}{kT} \right) s(E_m, N) = 0 . \tag{1.1.76}$$


This yields the value $E_m \simeq \frac{3}{2}NkT = \langle E \rangle$,
confirming our expectation that for a very narrow and highly
peaked function one expects the maximum and the average to
coincide.


Figure 1.1.38: Energy weight function. Three curves are plotted
near the origin for $N = 8$: the Boltzmann exponential
suppression factor $e^{-E/kT}$ (in blue), the density of states
$n(E; N)$ (in red), and their product, the energy weight function
$s(E; N) = n(E)\, e^{-E/kT}$ (in purple).


In Figures 1.1.38 and 1.1.39 we have illustrated how the
resulting weight function $s(E)$ (in purple) emerges as the
product of the very steeply rising, entropy-driven density
of states $n(E)$ (in red) and the exponential energy
suppression (in blue). We have plotted the case where
$N = 8$, which is not quite representative of a macroscopic
system. Even so, it is striking that a narrow peak results:
on the left the peak is driven high up by the degeneracy or
entropy factor $n(E)$, and on the right it is forced down
again by the energy-dependent exponential suppression
factor. For large $N$ the position of the maximum grows
proportional to $N$: $E_m \sim N$, its maximum height
increases exponentially: $s(E_m) \sim (\text{const.})^N$,
while the width grows only with the square root:
$\Delta E \sim \sqrt{N}$. So for large $N$ the relative width
decreases like $\Delta E/E_m \sim 1/\sqrt{N}$, and this
implies that the weight function becomes proportional to a
narrow Gaussian or rather a delta function. This means that
the essential behavior is very well represented by the
narrow red band (the hyper-spherical shell) we have drawn in
Figure 1.1.37. These estimates also show that the energy
fluctuations in the canonical ensemble will be very small,
which in turn means that the canonical and micro-canonical
ensembles are effectively equivalent if we choose $E = E_m$.


Figure 1.1.39: Ensemble energy weights. The weight function
$s(E; N)$ is the integrand of the partition function. Its maximum
increases like $\sim (\text{const.})^N$, the location of the
maximum grows $\sim N$, while the width grows only as $\sqrt{N}$.
In the limit of very large $N$, $s(E)$ becomes proportional to a
delta function. The micro-states that matter all sit in an
extremely narrow energy band, as indicated in Figure 1.1.37.
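The narrowing of the peak can be made concrete with a few lines of code. This is a sketch under stated assumptions: it works with the logarithm of the weight, $\ln s = \frac{3N}{2}\ln E - E/kT$ (dropping the constant $\ln C_N V^N$), uses units with $kT = 1$, and estimates the width from the curvature of $\ln s$ at its maximum, which gives $\Delta E/E_m = \sqrt{2/3N}$. The function names are illustrative, not taken from the text.

```python
import math


def log_weight(E, N, kT=1.0):
    """ln s(E, N) up to an E-independent constant."""
    return 1.5 * N * math.log(E) - E / kT


def peak_energy_numeric(N, kT=1.0, iterations=200):
    """Locate the maximum of the (concave) log-weight by ternary search."""
    lo, hi = 1e-12, 10.0 * N * kT
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_weight(m1, N, kT) < log_weight(m2, N, kT):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def relative_width(N):
    """Gaussian estimate of Delta E / E_m from the curvature at the peak."""
    return math.sqrt(2.0 / (3.0 * N))


for N in (8, 800, 80000):
    E_m = peak_energy_numeric(N)
    print(N, E_m, 1.5 * N, relative_width(N))
```

The numerically located maximum should track $\frac{3}{2}NkT$, while the relative width shrinks by a factor of ten for every hundredfold increase in $N$. For a macroscopic $N$ of order $10^{23}$ the same estimate gives a relative width of order $10^{-12}$.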


Classical versus quantum probabilities. We have cho- 
sen to highlight this statistical approach to classical many- 
body physics because we will see that also quantum the- 
ory is probabilistic and statistical at heart. And the com- 
parison of the classical statistical physics perspective with 
the statistical aspects of quantum theory is illuminating. 
In quantum theory the probabilistic interpretation is forced 
upon us right from the start at the level of a single parti- 
cle, and is encoded in the ‘wavefunction’ description of a 
quantum particle. The wavefunction is a ‘probability amplitude’,
and its absolute square represents a distribution.


That distribution gives the probability density p(x, t) to find 
the particle at position x at time t, and in that sense it has 
some mathematical resemblance to the case of statistical 
mechanics, where the canonical distribution for example
gives the probability $p(E_i)$ to find the many-body system
in a state with energy $E_i$.


There is, however, a fundamental difference between clas- 
sical and quantum probabilities; in classical physics a prob- 
ability generally reflects a lack of knowledge about the sys- 
tem, which we in principle could eliminate by making more 
precise measurements. In quantum physics it reflects a 
fundamental indeterminism, meaning that even if we have 
complete knowledge of the quantum state, a property like 
the spin component along a certain axis for example need 
not be uniquely fixed. In spite of this difference in inter- 
pretation we will see that there are numerous mathemati- 
cal concepts that can be carried over from statistical phys- 
ics to quantum theory, and (information) entropy is one of 
them. 


The path integral formulation of quantum theory. From
a fundamental perspective a profound yet very direct
relation between quantum and classical physics is
established through the framework of the (Euclidean) path
integral formulation of quantum theory, proposed by Feynman
following an idea of Dirac. The fundamental entity in
quantum theory is the probability amplitude $A_{if}$ for the
system to go from an initial state labeled $i$ to a final
state labeled $f$. The probability $p_{if}$ for the
transition to take place is then given by the absolute
square: $p_{if} = |A_{if}|^2$. The amplitude for a quantum
particle to go from $A$ at time $t_i$ to $B$ at time $t_f$
can in general be written as a weighted sum over all
possible paths $L(t)$ in classical configuration space that
satisfy the boundary conditions $L(t_i) = A$ and
$L(t_f) = B$. As it involves the integration over all
possible classical paths or field configurations, the
mathematics is quite complicated and in many cases lacks a
rigorous mathematical foundation. Yet it is a powerful
method that in many ways shows striking mathematical
parallels to statistical mechanics if one makes some
substitutions, like replacing the energy function by the
action functional in the statistical weight, and the
temperature by the product of $\hbar$ and some coupling. The
notion that the ‘free energy’ is equal to the log of the
partition function translates into the statement that the
‘effective action’ is the log of the unconstrained path
integral over classical configuration space. We return to
this topic towards the end of the book in Chapter III.4,
after we have gained more familiarity with the quantum
world.


Conclusion. Our guided tour along some of the highlights 
of classical physics has come to an end. To conclude this 
first chapter, we observe that towards the end of the nine- 
teenth century, many physicists thought that the physical 
universe was basically fathomed, with only minor details 
remaining to be settled. The fundamental laws had been 
laid down by a bunch of geniuses and the program was re- 
duced to merely applying them, skilfully applying them to 
be sure. That appeared to be a matter of diligent devotion, 
more something like stamp collecting than facing the chal- 
lenge of building another Rome in one day... 

Indeed, mission almost completed, but as we will see, not 
quite. Stated differently: Hell was about to break loose! 


Further reading. 
Some introductory textbooks on classical physics: 


Classical Dynamics of Particles and Systems 
S.T. Thornton, J.B. Marion 
Saunders College Publications; 4th edition (1995) 


Classical Mechanics 
J.R. Taylor
Cambridge University Press (2008) 


Electricity and Magnetism 
E.M. Purcell 
Cambridge University Press; 3rd edition (2013) 


Introduction to Electrodynamics 
D.J. Griffiths 
Cambridge University Press; 4th edition (2017) 


Fundamentals of Statistical and Thermal Physics 
F. Reif 
Waveland Press Inc (2008) 


An Introduction to Thermal Physics 
D.V. Schroeder 
Oxford University Press (2020) 


Complementary reading: 


The Equations 
S. Bais 
Harvard University Press (2005)


Chapter I.2 


The age of geometry, information and quantum 


And the continuity of our science has not been af- 
fected by all these turbulent happenings, as the 
older theories have always been included as lim- 
iting cases in the new ones. 

Max Born 


In spite of the prevailing scientific optimism towards the 
end of the nineteenth century, some of the most radical 
changes in our thinking about the workings of nature were 
about to surface. The monumental edifice of classical phys- 
ics started to show cracks which would turn out to be fa- 
tal. The crisis in this would-be infallibility centered around 
some phenomena that were not just hard to explain but 
were in manifest contradiction with the cherished classical 
dogmas. The limited domains of validity of classical phys- 
ics became apparent through the turning points of relativity 
and quantum theory. 

This chapter aims to provide a broad perspective on the 
new opportunities that opened up for science and technol- 
ogy in the twentieth century, and were derived in some way 
or another from the turning points that occurred early on. 
The subsequent sections cover introductions to the phys- 
ics of relativity, the physics of geometry, and the physics of 
information. We conclude this chapter with some general 
remarks on quantum theory. 


Canaries in a coal mine 


Challenges, contradictions and turning points. It is
interesting to note that already towards the end of the
nineteenth century, there were some rather well-known
experimental observations that seemed to challenge aspects
of the central dogmas of classical physics. We may call
these the canaries in the coal mine. Let us start with two
results that were puzzling at the time and were only
resolved by the radical shift in perspective caused by the
theories of relativity, though Einstein himself never
emphasized them as sources or motivations for his work.
Then we move on to puzzles that pushed us toward quantum
theory.


The Michelson–Morley experiment. This experiment set out
to measure the effect of the so-called ether (an
all-pervading medium through which classical
electromagnetic waves supposedly would propagate) on the
propagation of light. A non-zero effect was anticipated
because the earth would be in motion with respect to the
ether, and this would cause some dragging of the light in
the direction of the relative motion of the ether. Light
would therefore propagate at different velocities in
different directions. The measurements of Michelson and
Morley showed, however, that there was no such effect,
leading to the conclusion that the ether was a delusion. It
was Einstein who abolished the idea of an ether in his
special theory of relativity of 1905.


The (anomalous) perihelion precession of Mercury. It
had been observed as early as 1860 that the elliptical orbit
of Mercury as a whole rotated very slowly in its orbital
plane. Part of this rotation could not be accounted for by
Newton's laws, even when perturbations due to the other
planets (including a hypothetical new planet named Vulcan)
and the oblateness of the sun were taken into account. But
it turned out that the observed anomalous part of the
precession agreed to a high precision with the calculation
using the general theory of relativity, the new theory of
gravity formulated by Einstein in 1915. The anomalous
perihelion precession thereby furnished one of the earliest
experimental confirmations of general relativity.


Let us now turn to four early puzzles that could only be 
resolved with quantum theory. 


The black body radiation law. If we heat a body, it starts
to radiate. For a black body kept at a given temperature,
the classical formula due to Rayleigh and Jeans, describing
the radiation intensity as a function of frequency, failed
to describe the data, and in fact predicted an unphysical
divergence towards the high frequency end of the spectrum,
referred to as the ultraviolet catastrophe (see
Figure 1.2.1(a)). This came about because one applied the
classical equipartition of energy among the various modes
of the electromagnetic field. The resolution of this problem
by Max Planck in 1900 was based on the bold assumption that
the minimal amount (quantum) of energy $E$ of a mode is
equal to the frequency $\nu$ times a fundamental constant
denoted by $h$, according to his famous formula:


$$E = h\nu . \tag{1.2.1}$$


It is here that the proportionality constant $h$, named
after Planck, entered physics as a new universal constant of
nature. It is extremely small: in ordinary units the reduced
Planck constant, called h-bar, equals


$$\hbar = h/2\pi = 1.05 \times 10^{-34}\ \text{joule seconds} . \tag{1.2.2}$$


This tiny constant had a huge impact, since this innocent 
looking quantization formula marked the very beginning of 
the tumultuous quantum era. 


The classical radiation formula can be obtained from the 
quantum formula by taking the limit where h tends to zero, 
and in that sense quantum theory clearly marks the limited 
domain of validity of its classical predecessor. 
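The classical limit can be illustrated numerically. The text does not write out the two radiation formulas, so the sketch below assumes their standard textbook forms for the spectral energy density: Rayleigh-Jeans, $u = 8\pi\nu^2 kT/c^3$, and Planck, $u = (8\pi h\nu^3/c^3)/(e^{h\nu/kT} - 1)$. Their ratio depends only on $x = h\nu/kT$, so letting $h$ tend to zero at fixed frequency is the same as looking at low frequencies.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def rayleigh_jeans(nu, T):
    """Classical spectral energy density; grows without bound as nu**2."""
    return 8.0 * math.pi * nu**2 * K_B * T / C**3


def planck(nu, T):
    """Planck spectral energy density; expm1 keeps small x accurate."""
    x = H * nu / (K_B * T)
    return 8.0 * math.pi * H * nu**3 / C**3 / math.expm1(x)


T = 300.0  # kelvin, an illustrative temperature
for nu in (1e9, 1e12, 1e13, 1e14, 1e15):
    print(nu, planck(nu, T) / rayleigh_jeans(nu, T))
```

At radio frequencies the ratio is essentially one, so the classical formula is recovered. Towards the ultraviolet the ratio collapses exponentially, which is how the quantum hypothesis removes the divergence.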


The structure of the atom. It was known at the time that a 
gas of atoms of a particular type, like hydrogen or sodium, 
would absorb or emit light with a specific, discrete spec- 
trum of frequencies. Only narrow lines of particular colors 
would appear in the spectrum (see Figure 1.2.1(b)). Within 
the classical framework of Newton and Maxwell there was 
no way to account for this phenomenon, because even ac- 
cepting the structure of the atom with a positive nucleus 
and orbiting electrons, there would be no discrete energy 
levels. Worse still: the electron would radiate and there- 
fore lose more and more of its energy and finally fall into 
the nucleus. This fundamental instability was basically re- 
solved by Niels Bohr in the quantum mechanical atomic 
model he proposed, and therefore the stability of all matter 
we observe is a direct consequence of its quantum nature. 
Bohr’s model for an atom predicted an infinite but discrete 
set of bound states with a single unique ground state with 
the lowest energy. And this discreteness accounted for the 
discrete set of lines in the atomic spectra. 


The Compton effect. This effect refers to the fact that 
when scattering a high frequency X-ray off a charged par- 
ticle like the electron, the radiation itself behaves much like 
a particle with an energy E and momentum p given by the 
Planck formula, i.e. 


$$E = cp = h\nu . \tag{1.2.3}$$


Furthermore, the conservation laws of energy and momentum
were respected in such scattering processes (see
Figure 1.2.1(c)). This strongly supported the step made by
Einstein, who had postulated the existence of the photon as
the ‘particle of light’ with precisely the energy and
momentum properties just mentioned.


Figure 1.2.1: Meeting the challenge. Four crucial phenomena that
early quantum theory successfully accounted for and where
classical physics failed bitterly. (a) Planck's spectrum of black
body radiation solves the ultraviolet (short wavelength)
divergence of classical theory. (b) Discrete lines in atomic
spectra, indicating discrete energy levels of the atom.
(c) Compton scattering, showing the particle properties of
radiation. (d) Photo-electric effect with the frequency threshold
for the current to flow.


The photo-electric effect. This is the effect that if we
direct a light beam at a metal surface in a constant
electric field parallel to the surface, a current may run
because electrons are ejected from the surface and flow
through the circuit (see Figure 1.2.1(d)). The surprise was
that the magnitude of the current did not depend on the
intensity of the radiation in the way predicted by the
classical theory. It turned out that a current would only
start running if the frequency of the light in the beam
exceeded a certain critical value. If the frequency was
below that threshold, there would be no current,
irrespective of the intensity of the beam.


This behavior was beautifully explained by Einstein in his
1905 paper, using the particle-like interpretation of the
radiation. Only if the energy of a single photon (given by
Planck's formula) became larger than or equal to the
binding energy of an electron in the metal would the
electron be liberated by absorbing the photon. The rest of
the energy would be converted into the kinetic energy of
that electron.
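Einstein's explanation amounts to a simple energy balance, which the following sketch spells out. The binding energy (work function) of 2.3 eV is a hypothetical round value of the order found for alkali metals, used only for illustration, and the function names are not from the text.

```python
H = 6.62607015e-34    # Planck constant, J s
EV = 1.602176634e-19  # one electronvolt in joules


def photoelectron_energy_ev(frequency_hz, work_function_ev):
    """Kinetic energy (eV) of the liberated electron, or None below threshold.

    A single photon carries E = h nu; the electron is freed only if this
    is at least the binding energy, and keeps the remainder.
    """
    photon_ev = H * frequency_hz / EV
    if photon_ev < work_function_ev:
        return None  # no current, whatever the intensity of the beam
    return photon_ev - work_function_ev


def threshold_frequency_hz(work_function_ev):
    """Critical frequency at which h nu equals the binding energy."""
    return work_function_ev * EV / H


W_EV = 2.3  # hypothetical work function in eV
print("threshold frequency:", threshold_frequency_hz(W_EV), "Hz")
print("red light, 4e14 Hz:", photoelectron_energy_ev(4.0e14, W_EV))
print("near UV, 8e14 Hz:  ", photoelectron_energy_ev(8.0e14, W_EV))
```

Note that the intensity of the beam does not appear anywhere: it only sets how many photons arrive per second, and hence the size of the current once the threshold is passed.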


This concludes our brief summary of some of the deep 
crises that hit classical physics and that seeded the new 
paradigms of relativity and quantum theory. These consti- 
tute two turning points in our thinking that are unequalled in 
the history of science in the sense that they extended our 
understanding of the physical universe far beyond what we 
as humans could experience and perceive by direct sens- 
ing or observation. And to test these radical new ideas 
many new instruments and experimental techniques had 
to be developed as powerful extensions of the quite limited 
innate human ability to probe nature at very small or very 
large scales. Indeed, these radically new insights started 
a century of amazing progress, not only in physics and as- 
tronomy, but also in chemistry, material science and com- 
puter/information science. 


The physics of space-time 


The theories of special and general relativity, both largely 
connected with the person of Albert Einstein, showed that 
there is no objective way to separate time and space, there- 
by introducing the concept of space-time. In the special 
theory of 1905, this implied the unique role of the veloc- 
ity of light as a universal constant, and the equivalence 
of mass and energy. The general theory of 1915 further- 
more showed that space-time could be curved and had 
to be thought of as something dynamical. The concept 
of space-time changed from an external mathematical ab- 
straction to a physical entity, which itself carried energy 
and momentum. Einstein found the dynamical equations 
for the universe as a whole, as the inevitable consequence 
of this line of thinking. This means that we have to think 
of the universe we live in as a particular solution of the 
Einstein equations. 


Special relativity 


The theory of special relativity is based on two assump- 
tions: (i) the laws of nature should look the same for any 
set of observers that move with constant relative velocity 
with respect to each other, and (ii) the velocity of light in 
vacuum is exactly the same for all such observers. These 
assumptions, which have been confirmed by a wide vari- 
ety of precise experiments, have far-reaching implications: 
for example that the relative velocity between two moving 
objects can never exceed the speed of light c, but also 
that moving clocks tick slower. Probably the most
well-known consequence is the equivalence of mass and
energy, so concisely expressed by the magnificent equation
$E = mc^2$. This equation opened the possibility of
predicting processes where mass could turn into other forms
of energy such as radiation, and the other way around, where
for example a high-energy photon could create a
particle–anti-particle pair. These processes found ample
applications in quantum physics, in particular nuclear and
particle physics, as well as in the medical world: think of
positron emission tomography, or PET scanning, as a
diagnostic tool.


Figure 1.2.2: Einstein. (Source: Wikimedia.)


Space-time four-vectors. From a conceptual point of view
Einstein's special theory of relativity introduced the
notion of a flat four-dimensional space-time (also called
Minkowski space-time) with four coordinates. These are
usually denoted by $x^\mu = (ct, \mathbf{x})$ with the index
$\mu = 0, \ldots, 3$, where the zero index denotes the time
component. A point in space-time labels an instantaneous
event that takes place at time $t$ at a point $\mathbf{x}$
in space. Correspondingly, Einstein defined a four-momentum
$p^\mu = (E/c, \mathbf{p})$ for a particle,¹ where the
energy became the time component of the four-momentum, with
the usual spatial component $\mathbf{p} = m\mathbf{v}$.


If two observers move with constant relative speed, the
four-vectors they assign to a specific event turn out to be
observer dependent in a specific way. The components vary,
but the ‘length’ of the four-vector has to be the same for
the different observers. This means that the space-time
interval $s$ for a given event, defined as
$s^2 = c^2t^2 - |\mathbf{x}|^2$, has to be the same for
different observers. Similarly, one may define the notion
of rest mass $m_0$ for a particle as
$m_0^2c^4 = E^2 - |\mathbf{p}|^2c^2$, which is invariant,
that is to say that it takes the same value for all
relativistically equivalent observers.
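The invariance of the four-vector ‘length’ is easy to check explicitly. The sketch below assumes the standard Lorentz boost along the x-axis, which the text has not written out, and uses made-up components for the event and the four-momentum. The same function handles both four-vectors, since $x^\mu$ and $p^\mu$ transform in the same way.

```python
import math

C = 2.99792458e8  # speed of light, m/s


def boost_x(four_vector, v):
    """Components seen by an observer moving with velocity v along x."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    a0, a1, a2, a3 = four_vector
    return (gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3)


def minkowski_square(four_vector):
    """Time component squared minus the squared spatial length."""
    a0, a1, a2, a3 = four_vector
    return a0**2 - (a1**2 + a2**2 + a3**2)


event = (5.0, 3.0, 1.0, 0.0)     # hypothetical (ct, x, y, z) in meters
momentum = (5.0, 4.0, 0.0, 0.0)  # hypothetical (E/c, px, py, pz), arbitrary units

event_moving = boost_x(event, 0.6 * C)
momentum_moving = boost_x(momentum, 0.5 * C)

print(minkowski_square(event), minkowski_square(event_moving))        # s^2
print(minkowski_square(momentum), minkowski_square(momentum_moving))  # (m0 c)^2
```

The individual components change under the boost, but the two printed numbers on each line agree up to rounding: the first pair is the invariant interval $s^2$, the second is $m_0^2c^2$.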


The special theory of relativity makes the statement that 
the physics may look different for different observers, but 
a complete description can always be given in the frame 
of any observer. Furthermore, the theory tells you how to 
calculate what one observer should see if you know the 
observations from another one. It tells you how to trans- 
late any four-vector from one frame of reference to another. 
And equally important, it also tells you which are the
invariant quantities that will be the same in all frames. I
emphasize this point about frames here because interestingly
enough we will encounter similar challenges if we are to in- 
corporate properties and frames of observers in quantum 
theory in a consistent way. 


Relativistic versus rest mass. Let us dwell a little more
on the equivalence of mass and energy. We have so far
given two expressions for the energy: one is the canonical
$E = mc^2$, and the other is
$E^2 = m_0^2c^4 + |\mathbf{p}|^2c^2$, involving the rest
mass. The latter formula is depicted in Figure 1.2.3.


¹ The appearance of the velocity of light $c$, with units [m/s], in
the above definitions is natural, as it ensures that the units of
the four components of a relativistic vector are identical.


Figure 1.2.3: Relativistic particle energy. The relation between
energy $E$, rest mass $m_0$ and momentum $p$, and its limiting
behavior for $p \ll m_0c$ and $p \gg m_0c$.


The dashed red curve corresponds to the non-relativistic
(Newtonian) limit with $E = m_0c^2 + |\mathbf{p}|^2/2m_0$,
whereas the dashed blue line corresponds to the
ultra-relativistic limit where the energy is just
proportional to the momentum, $E = |\mathbf{p}|c$. Indeed,
the latter formula is just the expression for a massless
particle like the photon. The picture demonstrates nicely
how the properties of a relativistic particle smoothly
interpolate between Newtonian particle behavior and a
photon. One can also say that the dispersion
$E = E(|\mathbf{p}|)$ of the particle goes from quadratic
to linear.


From the two energy expressions, there follows a relation
between the relativistic mass $m$ and the rest mass $m_0$,
reading: $m^2 = m_0^2 + m^2|\mathbf{v}|^2/c^2$. The
conclusion is that in contrast with the rest mass $m_0$,
which is an invariant quantity characterizing the particle,
the relativistic mass $m$ depends on the momentum and is
thus frame and observer dependent. The equation above tells
us that $m^2 = m_0^2/(1 - v^2/c^2)$.² If you want to
accelerate a particle by applying a force, it is the
relativistic mass $m$ that comes in, and therefore particles
become effectively extremely massive if their velocity
tends to the velocity of light. This in turn implies that to
accelerate them further will cost ever more energy. This is
a fact that people who run big accelerators are painfully
reminded of every time they receive their utility bills! To
be fair I should mention that in an accelerator a large
fraction of the energy is lost due to the particles
radiating. The relation between masses tells us that the
relativistic mass goes to infinity if the velocity
approaches the speed of light. No wonder we cannot push
particles beyond that universal value!


² From here on we replace $|\mathbf{v}|^2$ simply by $v^2$ for convenience.


Figure 1.2.4: Mass curves the surrounding space. Comparing
the Newtonian paradigm (gravitational force, planetary orbit),
where masses cause a gravitational attractive force between sun
and planet, and the Einsteinian paradigm (mass causes curvature
of space), where mass curves the space and the gravitational
interaction is induced by the way the curved space affects the
motion of the planet.
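The growth of the relativistic mass with velocity can be tabulated in a few lines. The sketch below expresses the velocity as the fraction $\beta = v/c$ and uses the relation $m = m_0/\sqrt{1 - v^2/c^2}$ quoted above; the function names are illustrative.

```python
import math


def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2) for a speed v = beta * c below c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta**2)


def relativistic_mass(m0, beta):
    """m = gamma * m0, equivalent to m^2 = m0^2 / (1 - v^2/c^2)."""
    return m0 * lorentz_factor(beta)


m0 = 1.0  # rest mass in arbitrary units
for beta in (0.1, 0.6, 0.9, 0.99, 0.9999, 0.999999):
    print(beta, relativistic_mass(m0, beta))
```

Each additional pair of nines in $\beta$ multiplies the relativistic mass by roughly ten, so the last stretch towards the speed of light costs an unbounded amount of energy. The function deliberately refuses $\beta = 1$, where the expression diverges.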


General relativity 


The general theory of relativity, often called GR by
physicists, is the fundamental theory of gravity proposed by
Albert Einstein in 1915, in which the gravitational force is
a direct manifestation of the curvature of space-time. In
Figure 1.2.4 we have indicated the paradigm shift between
the Newtonian and Einsteinian perspective on planetary
motion. In the Newtonian paradigm the sun and planet have a
mass that causes a gravitational force between them, and
that attractive force causes the planet to move in an
elliptic orbit. In Einstein's picture the masses curve the
space around them, which is therefore no longer flat. The
planet then just feels the curvature of the space it is
moving in, which causes it to move in an (almost) elliptical
orbit. The gravitational interaction is then induced by the
curvature of space, like the trajectory of a marble on a
rubber sheet deformed by the mass of a heavy bowling ball
placed on it. The gravitational interaction manifests itself
through the curvature of space-time.


With GR, space-time became a dynamical part of our phys- 
ical universe. It was lifted from a bunch of silly coordinates 
to a fully interacting participant. Space-time was promoted 
from merely a static mathematical arena in which phys- 
ics unfolded, to a dynamical physical entity, representing 
physical degrees of freedom carrying energy and momen- 
tum itself. You could call it the ‘emancipation’ from passive 
mathematical framing to active physical reality. 


This development is analogous to electrodynamics, where 
initially the electromagnetic fields were considered as math- 
ematical constructs that could be used to calculate forces 
between charges and currents, and only with Maxwell’s 
treatise did it become clear that the fields themselves in 
a very direct sense represent the physics of electromag- 
netic radiation. Mentioning this analogy prompts the ques- 
tion of whether a gravitational analog of electromagnetic 
radiation exists. The answer, as we will see shortly, is af- 
firmative! 


General relativity demanded the use of a mathematics that
was quite remote from the practicing physicist's repertoire.
The language in which gravitational physics was formulated
changed from the Newtonian dynamical systems perspective to
full-fledged Riemannian differential geometry. Relativity
marked the beginning of a new golden age of geometry in
physics. That is a good reason to include a separate
section, following this one, entitled The physics of
geometry, which provides an introduction to the basic
concepts in the mathematics of curved spaces, concepts that
have proven to be as elegant as they are useful in many
domains of modern physics.


Figure 1.2.5: Bending of light by a mass. In a curved space-time
light moves along shortest distance curves. This means for
example that a light-ray emitted by a distant star will be bent
if it passes the sun. This effect provided one of the early
experimental confirmations of GR.


Seven predictions. The theory of General Relativity made 
seven almost independent predictions that in the past cen- 
tury have, one after the other, been confirmed experimen- 
tally. They are now part of the vast body of experimental 
evidence supporting the theory. We list them here with a 
brief explanation: 


(i) Bending of light. Generally the geometry of space-time
depends on the energy and momentum distribution of
radiation and matter in it, and in turn that geometry
influences the motion of that matter and radiation, as if
through a gravitational force acting on them. We have
indicated this effect in Figure 1.2.5. It was measured by
the expedition of the British astronomer Sir Arthur
Eddington during the solar eclipse of 1919, and provided one
of the first solid confirmations of Einstein's theory.


Figure 1.2.6: The perihelion precession of Mercury.


(ii) The perihelion precession of planetary orbits. Another
notable aspect of General Relativity is that it predicts a
deviation from the strictly elliptic orbits for planets. In the
Newtonian picture the axes of the ellipse are fixed in space,
while in the Einsteinian picture the ellipse rotates slowly in
the plane of the orbit, as we have schematically illustrated
in Figure 1.2.6. One way to understand this is that in General
Relativity the effective gravitational force that a static
source like the sun exerts on an orbiting planet differs from
the Newtonian one. If one expands in powers
of $(|L|/mcr)^2$, one finds for the force:

$$ F(r) = -\frac{G_N m M}{r^2}\left(1 + \frac{3|L|^2}{m^2 c^2 r^2}\right), \qquad (1.2.4) $$

where $m$ and $M$ are the masses of the planet (for example the earth)
and the sun, and $|L|$ is the angular momentum of the planet.
The thing to note is that the Newtonian inverse square law gets
a $1/r^4$ correction. The effect is the largest for the inner
planets (small $r$); for Mercury the precession amounts to
43 seconds of arc per century. This very slow precession had
in fact already been observed at the end of the 19th century,
before the advent of GR.


Figure 1.2.7: Gravitational redshift. If a photon loses energy
to the gravitational field while moving away from a star, its
wavelength increases and it gets redshifted. Similarly, due to
the expansion of the universe the wavelength of light emitted
from far away objects is also shifted towards the red.
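As a rough numerical check of the 43 seconds of arc quoted for Mercury, one can use the standard first-order GR result for the perihelion advance per orbit, $\Delta\varphi = 6\pi G_N M / (c^2 a (1 - e^2))$. Neither this formula nor the orbital data below (semi-major axis $a$, eccentricity $e$, orbital period) are given in the text; they are assumed here from standard references, so take this as a minimal sketch rather than a derivation.

```python
import math

# Assumed standard values (not taken from the text)
GM_SUN = 1.32712440018e20    # G_N * M_sun in m^3 / s^2
C = 299_792_458.0            # speed of light in m / s
A_MERCURY = 5.7909e10        # semi-major axis of Mercury's orbit in m
E_MERCURY = 0.2056           # orbital eccentricity of Mercury
PERIOD_DAYS = 87.969         # orbital period of Mercury in days


def perihelion_advance_per_orbit(gm, a, e):
    """First-order GR perihelion advance per orbit, in radians."""
    return 6.0 * math.pi * gm / (C**2 * a * (1.0 - e**2))


def arcsec_per_century(gm, a, e, period_days):
    """Accumulated advance over a Julian century (36525 days), in arcseconds."""
    orbits = 36525.0 / period_days
    radians = perihelion_advance_per_orbit(gm, a, e) * orbits
    return math.degrees(radians) * 3600.0


mercury = arcsec_per_century(GM_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS)
print(f"Mercury: {mercury:.1f} arcsec per century")
```

With these inputs the result lands very close to the 43 seconds of arc per century mentioned above. Because the advance scales like $1/a$ per orbit and the inner planets also complete more orbits per century, the effect drops off quickly for planets further out.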


(iii) Gravitational redshift. In GR matter and radiation interact
with space-time, which means that there will be an
exchange of energy between the gravitational and non-gravitational
degrees of freedom. So if a photon is emitted
from a nearby heavy object like a star and moves radially
out to some distant observer, it has to climb out of a gravitational
potential well and will thereby lose energy. For
a single photon this means that the frequency will come
down and therefore the wavelength has to increase. The
light will therefore be shifted towards the long-wavelength
or red end of the spectrum. This effect is called gravitational
redshift. The redshift is denoted by $z$ and is defined through
the ratio between observed and emitted wavelength as
$1 + z = \lambda_{\rm obs}/\lambda_{\rm em}$. A redshift is
also predicted to exist for photons coming toward us in an
expanding universe, and was the crucial ingredient in the
demonstration by Edwin Hubble that our universe is actually
expanding. This will be discussed in far more detail in
the next subsection.
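To make the definition $1 + z = \lambda_{\rm obs}/\lambda_{\rm em}$ concrete, here is a small sketch for light leaving the surface of the sun. It relies on the weak-field estimate $z \approx G_N M/(r c^2)$, which is a standard approximation that is not stated in the text, and on assumed solar values; the spectral line is just an example.

```python
# Assumed standard values (not taken from the text)
GM_SUN = 1.32712440018e20   # G_N * M_sun in m^3 / s^2
R_SUN = 6.957e8             # solar radius in m
C = 299_792_458.0           # speed of light in m / s


def redshift_from_wavelengths(lambda_obs, lambda_em):
    """Redshift z defined by 1 + z = lambda_obs / lambda_em."""
    return lambda_obs / lambda_em - 1.0


def weak_field_gravitational_redshift(gm, r):
    """Approximate z for light emitted at radius r and observed far away.

    Only valid when gm / (r * c^2) is much smaller than 1.
    """
    return gm / (r * C**2)


z_sun = weak_field_gravitational_redshift(GM_SUN, R_SUN)
lambda_em = 656.28e-9                    # H-alpha line in m, used as an example
lambda_obs = lambda_em * (1.0 + z_sun)   # wavelength seen by a distant observer

print(f"z for light leaving the solar surface: {z_sun:.2e}")
print(f"wavelength shift: {(lambda_obs - lambda_em) * 1e12:.2f} pm")
```

The redshift comes out at a few parts per million, which shows why the effect is hard to see for ordinary stars and only becomes dramatic near very compact objects.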


(iv) Gravitational waves. In a moment we will discuss how 
these waves were discovered in 2015, exactly one hun- 
dred years after their existence was predicted. Gravita- 
tional waves are waves in the fabric of space-time that 
travel with the speed of light. As we have seen, the gravitational
coupling constant, which is Newton’s constant $G_N$,
is extremely small compared to the electromagnetic coupling
$e$. This implies that one needs violent motions of
enormous masses to generate gravitational waves that are
energetic enough to be detected. For example, when black
holes form or collide, huge amounts of energy are
converted to space-time degrees of freedom. The
existence of the waves was one of the early predictions of
Einstein’s theory: by making a linear approximation to the
empty-space Einstein equations one does indeed find linear
wave equations very much like the equations for electromagnetic
waves. It took about a century before this type
of radiation was first observed directly, on 14 September
2015, by two gravitational wave detectors in the US.


Figure 1.2.8: Two colliding massive objects. The wavelike
space-time profile caused by two extremely massive objects, like
black holes, colliding. (Source: LIGO)


The LIGO project proposed to detect gravitational waves
with the use of two giant interferometers. An impressive
international effort by the US, the UK, Germany and Australia,
that altogether took some 30 years to complete, resulted
in the LIGO observatory. Each interferometer takes
a laser beam, splits it in two and sends it down two legs
at right angles to each other (see Figure 1.2.9). At the end
of each of the legs are mirrors, which bounce the beams
back to the center. If there is any difference in the leg
length, say caused by the passing of a gravitational wave,
the interference pattern of the two recombined laser beams
changes. The LIGO setup was extremely sensitive: it could
detect a change in the length of a leg ($\sim 10^3$ m) on the
order of the diameter of a proton ($\sim 10^{-15}$ m).
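To get a feeling for this sensitivity, it helps to express it as a dimensionless strain $\Delta L / L$. The sketch below uses the proton-sized length change quoted above; the arm length of 4 km and the laser wavelength of about 1 micrometre are assumptions added here for comparison and are not stated in the text.

```python
ARM_LENGTH_M = 4.0e3            # LIGO arm length in m (assumed, not from the text)
DELTA_L_M = 1.0e-15             # length change of order a proton diameter, in m
LASER_WAVELENGTH_M = 1.064e-6   # laser wavelength in m (assumed, not from the text)

# Dimensionless strain: relative change in the arm length
strain = DELTA_L_M / ARM_LENGTH_M

# The same length change expressed as a fraction of one optical wavelength
fraction_of_wavelength = DELTA_L_M / LASER_WAVELENGTH_M

print(f"strain dL/L                   : {strain:.1e}")
print(f"dL as a fraction of wavelength: {fraction_of_wavelength:.1e}")
```

The length change is about a billionth of a single wavelength of the light used to measure it, which is why the instrument has to rely on interference rather than on any direct measurement of length.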


The researchers managed to work out the source of the
signal, because their model fitted the data so well. The signal
is believed to have come from two black holes, 29 and 36 times
heavier than the sun, merging into a single black hole of 62 solar
masses (see Figure 1.2.8), meaning that 3 solar masses
were emitted in the form of gravitational radiation! As a result
of the fundamental importance of the discovery, the 2017
Nobel prize in Physics was awarded with one half to
Rainer Weiss and the other half jointly to Barry C. Barish and
Kip S. Thorne, ‘for decisive contributions to the LIGO detector
and the observation of gravitational waves.’
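The mass bookkeeping of the merger can be turned into an energy with $E = \Delta m\, c^2$. The masses are the ones quoted above; the value used for the solar mass in kilograms is an assumed standard value that does not appear in the text.

```python
M_SUN_KG = 1.989e30     # solar mass in kg (assumed standard value)
C = 299_792_458.0       # speed of light in m / s

# Masses in units of the solar mass, as quoted in the text
m1, m2, m_final = 36.0, 29.0, 62.0

radiated_solar_masses = m1 + m2 - m_final
radiated_energy_joule = radiated_solar_masses * M_SUN_KG * C**2

print(f"mass radiated away : {radiated_solar_masses:.0f} solar masses")
print(f"energy radiated    : {radiated_energy_joule:.2e} J")
```

The outcome is of order $10^{47}$ joule, released within a fraction of a second, which illustrates how violent an event has to be before its gravitational waves become detectable on earth.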


Figure 1.2.9: A LIGO gravitational wave detector. An aerial photograph
of one of the two gravitational wave interferometers. A
laser beam gets split, after which the beams travel back and
forth through two orthogonal legs. If a gravitational wave passes
through one of them, the two signals show a detectable phase
difference after returning. (Source: Advanced LIGO)


We know that electromagnetic radiation when quantized is
directly linked with a massless particle called the photon.
Likewise, gravitational waves correspond to a massless
quantum particle called the graviton. As I have said, it couples
extremely weakly and therefore will not play any role
in ordinary high-energy accelerator experiments. There is
a second fundamental difference between the photon and
the graviton: the former is a spin-one particle, and the latter
has spin two. This comes about because electromagnetic
waves are dipolar, while gravitational waves are
quadrupolar. In modern views on gravity people
tend to think of the gravitational interactions as an emergent
phenomenon, which means that Einstein’s equations
correspond to an effective theory of space-time. It could be
that there are more fundamental degrees of freedom (like
so-called superstrings or D-branes) that space-time is really
composed of. In that case the quantization of gravity
would start from there, and the graviton would rather be a
collective excitation, a so-called quasi-particle.


The remaining three predictions of GR are: 
(v) The existence of black holes, 

(vi) The expanding universe, 

(vii) A cosmological constant. 


These are of fundamental interest in modern physics and 
therefore we will discuss them separately. The expand- 
ing universe and the role of the cosmological constant are 
the subject of the next subsection on cosmology, while we 
will discuss some aspects of black holes in the concluding
section of the next chapter on page 139.


Big Bang cosmology 


The Einstein equations are nothing less than a set of equa- 
tions for space-time as a whole, which means that our uni- 
verse should correspond to one of the solutions. These 
equations have played a glamorous role in 20th century 
physics and created the astoundingly successful field of 
observational cosmology. There are many good reasons 
to present the modern view on the cosmological evolution. 
It corresponds to the hot Big Bang model described by the 
Friedmann equation, generalized by Lemaître to include
the effect of the cosmological constant. This model de- 
scribes the dynamical arena in which the world became 
the way we know it. In the third part of the book we de- 
scribe in more detail the physical processes that took place 
at the very early stages of the universe. We will come 
to appreciate that the combination of understanding ba- 
sic quantum physics, and cosmology based on GR, leads 
to an impressive account of the evolutionary process to- 
wards an increasing complexity in inanimate matter that 
preceded the Darwinian biological evolution. Indeed it took 
the universe billions of years to produce the chemical build- 
ing blocks of life. 


The Friedmann-Lemaître equation. GR in its full generality
is quite complicated. However, with a number of
simplifying (yet entirely justifiable) assumptions about the
structure of our universe, the general equations can be
reduced to two strikingly simple equations. The assumptions
are referred to as homogeneity and isotropy, where
the meaning of the first is that the universe is the ‘same’
at any place at any given instant in time, and the second
means that the universe looks the same in any direction
at any given instant. In fact one can show that isotropy
about every point implies homogeneity, but not the other
way around. The first of the resulting equations basically
expresses the conservation of energy. The second is the
so-called Friedmann equation, named after the versatile
Russian mathematician and engineer Aleksandr Aleksandrovich
Friedmann, who proposed the equation in 1922.³
The equation reads:

$$ \left(\frac{1}{a}\frac{da}{dt}\right)^2 = \frac{8\pi G_N}{3}\,\rho - \frac{k c^2}{a^2}, \qquad (1.2.5) $$

where $a = a(t)$ is the scale factor, a measure for the relative
size of the spatial universe. You may think of $a$ as the
relative average distance between two galaxies, meaning
that the distance $d(t)$ between the two galaxies at time $t$
would be proportional to $a(t)$: $d(t) = a(t)\,d_0$. The distances
between objects co-moving with respect to the expansion
grow proportional to $a(t)$, where in addition we
have made the choice that at present $a(t_0) = a_0 = 1$. On the
right-hand side we have the total energy density $\rho = \rho(a)$
(expressed here as a mass density, that is, divided by $c^2$).
Clearly, this is the equation that governs the possible dynamics
of homogeneous/isotropic universes. The ‘curvature
constant’ $k$, which can be scaled to take the values
$+1$, $0$ or $-1$, determines whether the space is closed like
a sphere, flat, or open like a hyperboloid, as illustrated in
Figure 1.2.10. As we will see, the value of $k$ also decides
whether the universe will ultimately end in a big crunch
($k = +1$), keeps expanding ($k = -1$), or sits in the critical
state ($k = 0$) just in between.


Figure 1.2.10: Curvatures. The closed, open and flat curvatures
corresponding to $k = +1$, $-1$, and $0$, respectively.


³ Many physicists also link the names of Lemaître and De Sitter to
this law.


Friedmann sent the equation to Einstein, showing that it
had no static solution but did have a solution corresponding
to an expanding universe originating from an initial singularity.
Einstein didn’t like the equation. While acknowledging
that it was mathematically correct, he thought it was
unphysical and ‘suspicious’ exactly because it predicted
an expanding universe. He then put considerable effort into
neutralizing the expansion by adding the so-called cosmological
constant $\Lambda$, without much success. Important work
in 1927 by the Belgian priest and mathematical astrophysicist
Georges Lemaître, generalizing Friedmann’s work to include
the cosmological constant, confirmed the expansion.


The real breakthrough came with the mind- and universe-blowing
1929 observations of Edwin Hubble, which provided
the experimental confirmation of the expansion. It
was only then that Einstein realized the great importance
of Friedmann’s work and how he had missed a unique opportunity
to make one of the greatest predictions in the
history of science. Later in his life he called the introduction
of the cosmological constant in his striving for a
static universe the ‘biggest blunder in my life.’ After the
expansion was established the new parameter, alias the cosmological
constant, silently faded away, until recently when
it rather ironically made a glorious and dramatic comeback
in a more subtle guise as a term representing the vacuum
or dark energy.


Figure 1.2.11: Hubble law. Plotting recession velocity of distant
galaxies versus their distance gives the linear relation $v = H_0 d$,
which is Hubble’s law. $H_0$ is thus the slope of the line through
the data points, that is, the tangent of the angle which
the line makes with the horizontal axis.


The Hubble parameter. An important observable quantity
is the Hubble parameter, or expansion rate, or relative
expansion velocity, defined as:

$$ H(t) \equiv \frac{1}{a}\frac{da}{dt}. \qquad (1.2.6) $$

So, if we observe the present value $H_0$ for the Hubble parameter
and we determine the total energy density $\rho_0$, and
put those values back into the Friedmann equation, that
would tell us whether $k$ is positive or negative or zero. So
in that sense the density determines our destiny. The in-between
$k = 0$ case at present ($t = t_0$) defines a critical
density:

$$ \rho_{\rm crit} = \frac{3 H_0^2}{8\pi G_N}. \qquad (1.2.7) $$

Let me first go back to the definition of the Hubble parameter
in equation (1.2.6). If we write it out explicitly for the
present time it has a nice interpretation:

$$ \left(\frac{da}{dt}\right)_0 = H_0\, a_0 \quad \Leftrightarrow \quad v = H_0\, d. \qquad (1.2.8) $$

I read the correspondence as follows: looking from any
fixed point in space, I see distant objects at distance $d = a_0 d_0$
receding from me with a velocity $v = (da/dt)_0\, d_0$, and then the
relation just reads $v = H_0 d$. This is the celebrated Hubble
law, depicted in Figure 1.2.11. Clearly the slope
in the observed $v$ versus $d$ plot gives you the observed value
for $H_0$. The redshift observations by Hubble in 1929 were
one of the great discoveries of 20th century (astro)physics
because they implied that our universe was expanding, a fact
that, as mentioned, Einstein himself up to that moment
did not believe to be possible.
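A short numerical sketch may help to put numbers to the Hubble law $v = H_0 d$ and to the critical density, for which the standard expression $\rho_{\rm crit} = 3H_0^2/(8\pi G_N)$ is used (as a mass density). The value $H_0 = 70$ km/s/Mpc is an assumption chosen for illustration; the text does not quote a value, and the function names are of course hypothetical.

```python
import math

G_N = 6.674e-11              # Newton's constant in m^3 kg^-1 s^-2
METERS_PER_MPC = 3.0857e22   # one megaparsec in m
M_PROTON_KG = 1.6726e-27     # proton mass in kg

H0_KM_S_MPC = 70.0           # assumed present Hubble parameter in km/s/Mpc


def hubble_velocity_km_s(distance_mpc, h0=H0_KM_S_MPC):
    """Recession velocity v = H0 * d, in km/s, for a distance in Mpc."""
    return h0 * distance_mpc


def critical_density(h0=H0_KM_S_MPC):
    """Critical mass density 3 H0^2 / (8 pi G_N), in kg/m^3."""
    h0_si = h0 * 1.0e3 / METERS_PER_MPC   # convert km/s/Mpc to 1/s
    return 3.0 * h0_si**2 / (8.0 * math.pi * G_N)


v = hubble_velocity_km_s(100.0)
rho_crit = critical_density()
print(f"v at 100 Mpc     : {v:.0f} km/s")
print(f"critical density : {rho_crit:.2e} kg/m^3")
print(f"                 ~ {rho_crit / M_PROTON_KG:.1f} proton masses per m^3")
```

The critical density turns out to be only a handful of proton masses per cubic metre, so the question of whether the universe is open, flat or closed is decided by an extraordinarily dilute average density.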


A mechanical analogue. To get a better understanding
of the expanding universe we are going to massage the
Friedmann equation into a more familiar form, so that we
can apply some of our conventional intuitions. Let us first
put the constant $H_0$ back into the Friedmann equation and
write it as follows:

$$ \left(\frac{da}{dt}\right)^2 = -H_0^2\, V(a) - k c^2, \qquad (1.2.9) $$

where $V = -a^2 \rho(a)/\rho_{\rm crit}$ is some effective ‘cosmological’
potential. In the modern approach the (relative) energy
density has three parts, referring to radiation, matter and
the vacuum respectively, thus we write:

$$ V(a) = -\left(\frac{\Omega_r}{a^2} + \frac{\Omega_m}{a} + \Omega_v\, a^2\right), \qquad (1.2.10) $$

where the $\Omega$’s are the present values of the relative
energy parameters, to be obtained from observation. As
I alluded to before, the vacuum term is a remake of Einstein’s
cosmological constant. It has to be added because
other dramatic recent observations have shown that the
term is actually there. To understand what all of this means
we have plotted the potential for equal values of the $\Omega$’s in
Figure 1.2.12. The qualitative behavior is rather easy to
understand: as indicated in the figure, for small $a$ the radiation
component dominates, because it comes with the
$1/a^2$ factor. For large $a$ it is the vacuum term which dominates,
as it comes with the $a^2$ factor. Note that the vacuum
energy causes an expansive force; it remarkably corresponds
to a gravitational repulsion or a negative pressure.
The potential is certainly unusual because it has no
stable minimum: it runs off to minus infinity, both for $a$ going
to zero and for $a$ going to infinity. It is strikingly different
from, say, the good old harmonic potential of Figure 1.1.13.
It is inverted, we have turned it upside-down!


To nevertheless make sense out of it let me remind you of
equation (1.1.5) from Chapter 1.1, where we derived the expression
for the conserved total energy of a particle moving
in a potential as:

$$ E = \tfrac{1}{2} m v^2 + V(x), \qquad (1.2.11) $$

where the total energy $E$ is a sum of the kinetic energy and
potential energy $V(x)$. But, lo and behold, that is, up to
some substitutions ($m = 2/H_0^2$, $x = a$, $v = da/dt$, and
the conserved $E = -kc^2/H_0^2$), exactly the same as the
Friedmann equation (1.2.9).
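The marble picture is easy to put on a computer. The sketch below assumes the normalised form $(da/dt)^2/H_0^2 = E - V(a)$ with $V(a) = -(\Omega_r/a^2 + \Omega_m/a + \Omega_v a^2)$ and $E = -kc^2/H_0^2$, which is what the substitutions above amount to. Requiring $da/dt = H_0$ at $a = 1$ today fixes $E = 1 - (\Omega_r + \Omega_m + \Omega_v)$. The function names and the example values $\Omega_m = 0.3$, $\Omega_v = 0.7$ are illustrative assumptions.

```python
import math


def potential(a, omega_r, omega_m, omega_v):
    """Effective potential V(a) = -(Omega_r/a^2 + Omega_m/a + Omega_v*a^2)."""
    return -(omega_r / a**2 + omega_m / a + omega_v * a**2)


def marble_speed(a, energy, omega_r, omega_m, omega_v):
    """Speed (da/dt)/H0 from energy conservation: (v/H0)^2 = E - V(a)."""
    return math.sqrt(energy - potential(a, omega_r, omega_m, omega_v))


def time_to_reach(a_end, energy, omega_r, omega_m, omega_v, steps=100_000):
    """Time, in units of 1/H0, for the marble to roll from a = 0 to a_end.

    Midpoint rule after the substitution a = s^2, which removes the
    integrable singularity of the integrand 1/speed at a = 0.
    """
    s_end = math.sqrt(a_end)
    ds = s_end / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        a = s * s
        total += 2.0 * s * ds / marble_speed(a, energy, omega_r, omega_m, omega_v)
    return total


omega_r, omega_m, omega_v = 0.0, 0.3, 0.7
energy = 1.0 - (omega_r + omega_m + omega_v)   # equals -k c^2 / H0^2; zero for k = 0
age = time_to_reach(1.0, energy, omega_r, omega_m, omega_v)
print(f"time from a = 0 to a = 1: {age:.4f} / H0")
```

For this flat example the travel time from $a = 0$ to the present $a = 1$ comes out slightly below $1/H_0$, so the age of such a model universe is set almost entirely by the measured expansion rate.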


How remarkable: we have ended up with a one-particle
mechanical analogue in one dimension for the 4-dimensional
universe! That is apparently what cosmic scenarios look
like: just kicking a marble and looking at how it runs
up and down the hill! I don’t know who ordered that pizza, but
I’ll certainly eat it!


Figure 1.2.12: The effective cosmological potential. The three
terms in the generic cosmological potential $V(a)$. The regions
in $a$ where the different contributions dominate are indicated
(meaning that they are closest to the dark blue curve representing
the total potential). For small $a$ radiation dominates, for
intermediate scales it is the matter term, while for large values of
$a$ the repulsive vacuum term takes over.


The effective cosmological potential $V(a)$ looks generically
like the dark blue curve in Figure 1.2.12. As we have mentioned,
this potential has no stable minimum and in fact has
two singularities, one at $a = 0$ and the other at $a = \infty$.
Apparently there is no fixed scale for the marble-universe
to come to rest. Now this is the joy of analogues: they force
you to think about what these strange singular features
could possibly mean. Cognitive laziness does not suffice,
we have to think! Figure 1.2.13 shows what the equations
are trying to tell us. Well, the singularity at $a = 0$ represents
the dramatic event which we called the Big Bang.
You could think of it as a marble being shot uphill with considerable
kinetic energy so that it can climb the mountain
from the left. How high? Well, that depends on how hard it
gets kicked. If it is kicked a little, it will roll back, and if we
slam it hard it will move all the way up, go over the hill and
start an infinite descent into another special state. In the
latter case the marble-universe keeps accelerating if the
vacuum energy density is non-vanishing, causing a race
to the bottom on the other side of the potential, a bottom
that isn’t really there! It describes a state where the universe
keeps expanding in an accelerating mode forever,
and the matter and radiation will thin out forever with their
densities approaching zero.


Figure 1.2.13: Evolution scenarios in the potential energy landscape.
(Labels in the figure: ‘Crunch’ and ‘Forever expanding’.)
The total energy of the universe equals one of the lines labeled
$k = \pm 1$. For negative energy ($\rho > \rho_{\rm crit}$) the evolution follows
the green arrows: the universe climbs the potential
barrier up to the green line and then falls back towards a
big crunch. If the energy is positive or zero ($\rho \leq \rho_{\rm crit}$),
corresponding to the red ($k = -1$) and blue ($k = 0$) arrows, the
universe easily climbs over the hill and starts accelerating indefinitely.


In Figures 1.2.13 and 1.2.14 the same three scenarios are
depicted: the first shows the potential energy as function
of scale factor, and the second the scale factor as function
of time. They show three distinct possibilities (with non-vanishing
vacuum energy):


(i) The green scenario with $\rho > \rho_{\rm crit}$ or $k = +1$ ends in
a Big Crunch, because the total energy corresponding to
$-kc^2/H_0^2 = -c^2/H_0^2$ is not enough to get us over the top. At
the point where the marble is turning around, its velocity is
zero, which means that all the energy is just potential energy.
Consequently the point where the total energy line,
corresponding to $k = +1$, intersects with the blue potential
energy curve is precisely the turning point of the green
arrow that represents the trajectory of the universe.

(ii) In the red scenario with $\rho < \rho_{\rm crit}$ or $k = -1$ the marble
moves over the top, after which the expansion will go
on forever. In this case, there is not enough matter (and
radiation) energy to pull the matter back in.

(iii) The $k = 0$ case is of particular interest. If there is a
non-vanishing vacuum energy, the top of the potential is at
an energy below zero, which means that in the $k = 0$ case
the marble still has a non-vanishing velocity at the top and
will therefore move over the hill, entering the domain of eternal
expansion.


Figure 1.2.14: Cosmological evolution scenarios. The solutions
for the cosmic scale factor $a$ as a function of the time for different
choices of the (non-zero) relative mass and vacuum densities
(one curve is labeled $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$).
The green scenario is a collapsing universe ending up in a Big
Crunch. The blue graph on top represents our so-called Big Chill
universe, which keeps expanding. Compare with the previous figure.
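The three scenarios can be told apart numerically by letting the marble roll from today’s position $a = 1$ towards larger $a$ and checking whether it ever runs out of kinetic energy. This is a minimal sketch under stated assumptions: radiation is neglected, the scale factor is normalised to $a = 1$ today so that the conserved energy is $E = 1 - (\Omega_m + \Omega_v)$, and the scan stops at an arbitrary cut-off `a_max`. The function names are hypothetical.

```python
def potential(a, omega_m, omega_v):
    """Effective potential with matter and vacuum only: V(a) = -(Omega_m/a + Omega_v*a^2)."""
    return -(omega_m / a + omega_v * a**2)


def fate(omega_m, omega_v, a_max=1.0e3, steps=200_000):
    """Classify a model by scanning the marble's kinetic energy for a > 1.

    The conserved energy is E = 1 - (omega_m + omega_v), which plays the
    role of -k c^2 / H0^2. Returns ("crunch", a_turn) if E - V(a) drops to
    zero at some a_turn, else ("expands forever", None) when no turning
    point is found below the cut-off a_max.
    """
    energy = 1.0 - (omega_m + omega_v)
    da = (a_max - 1.0) / steps
    for i in range(1, steps + 1):
        a = 1.0 + i * da
        if energy - potential(a, omega_m, omega_v) <= 0.0:
            return "crunch", a
    return "expands forever", None


for omega_m, omega_v in [(2.0, 0.0), (0.3, 0.0), (0.3, 0.7)]:
    label, a_turn = fate(omega_m, omega_v)
    print(f"Omega_m = {omega_m}, Omega_v = {omega_v}: {label}, turning point: {a_turn}")
```

A matter-heavy model without vacuum energy turns around and recollapses, while the under-dense model and the flat model with vacuum energy never meet a turning point within the scanned range. Because of the finite step size the turning point is only located to within one step `da`.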


The Einstein universe. One could imagine cooking up a
special case where the top of the potential would exactly
touch the $k = +1$ line. In this case the marble would end up
exactly on the top, where in principle it could stay forever.
Forever? But wait, this is like putting a marble on top of a
bald head: there is indeed a fixed point, but it is clearly unstable,
as any little perturbation will make the marble move
one way or the other. In that special case the decision on
the fate of our universe would be postponed! The future
of the universe would boil down to tossing a coin! The
very special solution where the universe just sits forever
on top corresponds to the completely static universe that
motivated Einstein to introduce the cosmological constant
(or vacuum energy term) in the first place. He apparently
didn’t check the stability of the solution.


Figure 1.2.15: De Sitter and Einstein in 1932. De Sitter
and Einstein discussing some non-static solutions for the
universe. (Source: https://repository.aip.org/islandora/object/nbla:288847.)


Vacuum energy. From the last figures it is also clear that
the vacuum energy term is peculiar in that it causes an outward
directed force: it acts like a negative pressure term.
It is apparently a form of energy that gravitationally repels!
You might be tempted to think of anti-matter, but that can’t
be it because anti-matter has positive mass, so gravitationally
it is attractive like ordinary matter. This vacuum
stuff is peculiar and is really repulsive! In the top-blue and
red scenarios we see that for large times the behavior is
completely determined by this vacuum contribution, so let
us see what happens to the scale parameter in that case.
If we go back to the Friedmann equation (1.2.9) and only
put in the dominant vacuum contribution ($\Omega_v = 1$, $k = 0$)
and bring the $a^2$ factor to the other side, we get:

$$ \left(\frac{1}{a}\frac{da}{dt}\right)^2 = H_0^2. \qquad (1.2.12) $$

This equation is simple to solve⁴ and yields an exponential
expansion:

$$ a(t) = a_0\, e^{H_0 t}. \qquad (1.2.13) $$

This exponentially expanding solution is called the De Sitter
universe, after Willem de Sitter, the Dutch astronomer
who came up with the solution as early as 1917. So in the
third picture we see the top-blue and red arrows indeed
starting to go up exponentially. This solution played an
important role in the debates that Einstein and De Sitter
(see Figure 1.2.15) had about the various non-static universes.
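For readers who want to see the step to the exponential solution spelled out, here is a short sketch. It starts from the vacuum-dominated equation $\left(\frac{1}{a}\frac{da}{dt}\right)^2 = H_0^2$, takes the positive root (an expanding universe), and writes $a_0$ for the scale factor at the moment chosen as $t = 0$; that labelling is an assumption of this sketch.

```latex
\begin{align*}
\left(\frac{1}{a}\frac{da}{dt}\right)^{2} = H_0^{2}
  \quad &\Longrightarrow \quad \frac{da}{dt} = H_0\, a
  && \text{(expanding branch)} \\
\frac{da}{a} = H_0\, dt
  \quad &\Longrightarrow \quad \ln\frac{a(t)}{a_0} = H_0\, t \\
  &\Longrightarrow \quad a(t) = a_0\, e^{H_0 t}.
\end{align*}
```

The key feature is that the growth rate of $a$ is proportional to $a$ itself, which is the hallmark of exponential behavior: the scale factor doubles every $\ln 2 / H_0$, no matter how large it already is.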


⁴ We will solve it in the Math Excursion on functions in Part II.


Figure 1.2.16: The cosmic event horizon. This horizon is defined
as the surface around us where the speed due to the expansion
equals the speed of light $c$. Messages sent now from
any point in the black region beyond the horizon would never
reach us.


Cosmic event horizon. Expanding universes have the
interesting but somewhat puzzling property that if things
move away from me at a velocity proportional to their distance,
then inevitably at some distance things recede with
a velocity greater than the speed of light. This clearly happens
as soon as $r > R_H$, where

$$ R_H = \frac{c}{H_0}. \qquad (1.2.14) $$

Can it then be that ‘things’ move faster than the speed of
light? Doesn’t that make Einstein turn in his grave? Actually
it does not, as his velocity veto concerns relative velocities
at a given point in space-time. So indeed, expansion
velocities of remote parts of space exceeding $c$ are admissible,
and are inevitable in expanding universes. They
have a clear physical interpretation, in that they imply the
existence of a cosmological horizon. In Figure 1.2.16 we
have sketched the situation. We imagine ourselves to be
at the centre with concentric spheres around us. Points on
the sphere with radius $r$ move away with the Hubble velocity
$v = H_0 r$ and the horizon corresponds therefore to the
sphere with $r = R_H = c/H_0$. What this means is that if
somebody beyond that horizon decides to send us a light
signal at this very moment, that signal will never be able to
reach us, because it will not be able to cross the horizon.
This horizon is at a distance of about 13.7 Giga light years,
‘far out’ so to speak. Very far away and nothing to worry
about. That’s what you would think, but Stephen
Hawking made the great discovery that horizons have very physical
properties: they are a source of thermal black body radiation.
Therefore adventurous physicists have been speculating
about the conceivable roles this horizon might play
in the explanation of contemporary cosmological observations
like dark matter and dark energy. We will comment
on these ideas later on.
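The distance to this horizon follows directly from $R_H = c/H_0$. The sketch below evaluates it for a few values of $H_0$ around 70 km/s/Mpc; these values are assumptions chosen for illustration, since the text does not quote $H_0$ itself, and the conversion factors are standard values.

```python
C_KM_S = 299_792.458          # speed of light in km/s
METERS_PER_MPC = 3.0857e22    # one megaparsec in m
METERS_PER_LY = 9.4607e15     # one light year in m


def hubble_radius_gly(h0_km_s_mpc):
    """Hubble radius R_H = c / H0, in giga light years, for H0 in km/s/Mpc."""
    radius_mpc = C_KM_S / h0_km_s_mpc
    return radius_mpc * METERS_PER_MPC / METERS_PER_LY / 1.0e9


for h0 in (67.0, 70.0, 73.0):
    print(f"H0 = {h0:.0f} km/s/Mpc  ->  R_H = {hubble_radius_gly(h0):.1f} Gly")
```

For expansion rates in this range the radius comes out at roughly 14 giga light years, in line with the distance quoted above. Since $R_H$ is inversely proportional to $H_0$, a few percent uncertainty in the measured expansion rate translates into the same relative uncertainty in the horizon distance.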


Cosmic inflation 


Problems with the standard expansion model. 


Particle horizons. We now turn to the phenomenon of a 
particle horizon. This type of horizon should not be con- 
fused with the cosmic event horizon, as it has a very differ- 
ent origin; the existence of a particle horizon derives from 
the fact that the universe had a beginning. That means 
that for any observer at any given instant in time, there is a 
specific ‘particle horizon.’ Light emitted from points beyond 
that horizon never had time enough to reach us. Basically 
the particle horizon defines the size of the observable uni- 
verse at any given instant, and the definition naturally im- 
plies that the observable universe grows as time goes by. 
This is schematically illustrated in Figure 1.2.17. The par- 
ticle horizon is just the intersection of our past light cone, 
with the spatial surface where the time equals zero. This
figure also illustrates the notion of a causally connected
domain, which has half the radius of the particle horizon.
It is the domain in which any point would have had
enough time to communicate with any other point in the 
domain. It is important to note that the younger the uni- 
verse is the smaller the size of a causal domain. So our 
observable universe breaks up into ever more causal do- 
mains if we go back in time. And this leads to a problem 
with the standard big bang model and observations that 
we turn to next. 


The (particle) horizon problem. The ever smaller size of
particle horizons at earlier epochs of the universe creates
a notorious paradox known as the ‘horizon problem.’ This
problem concerns a conflict between present-day observations
and the original Friedmann-Lemaître expanding model
of the universe. We at present observe the cosmic background
radiation from all directions in the sky. This radiation
was emitted at the moment that electrically neutral
atoms formed, when the universe was about 300,000
years old. That radiation has not interacted since: it decoupled
from the matter. And that is the reason why we observe
a perfect thermal spectrum now, which is redshifted
because of the expansion of the universe after the decoupling
took place. It constitutes the strongest direct observational
evidence for the expansion of our universe. It appears
exactly as predicted. However, there is something
puzzling here: the radiation that comes to us from opposite
sides of the universe shows exactly the same spectrum,
apparently originating from the same thermal plasma.
How can that be? After all, at the time the photons decoupled,
the places where that radiation originated were not
within one causal domain. To get an idea, let us look at
Figure 1.2.17. If we imagine the radiation to be released at
$t = t_1$, then it can have equilibrated over distances corresponding
to the size of the causal domain with radius
$R_{\rm obs}(t_1)$ as indicated in the figure. My causal domain consists
of all the points from which information could reach
me within the age of the universe $t_0$. That would mean
that at present we would be able to make observations all
the way out to $R_{\rm obs}(t_0)$, a much larger domain. Thus we
would not expect the perfect black body spectrum we happen
to observe. In other words, at the time of decoupling
the region corresponding to our presently observable universe
contained many causal domains. At that age of the
universe, there had not been time enough to reach thermal
equilibrium over distances that comprise the total observable
universe at present. This is an irrefutable fact if
we assume that the standard expansion of the universe is
correct. And this fact poses a serious problem for the standard
Friedmann-Lemaître model. This problem has been
resolved by making a major amendment to the course of
events in the very early universe. This is a fundamental
update: the expanding universe 2.0, also called the inflationary
universe. But before we get into that we want to
first mention another problem with the standard cosmological
model.


Figure 1.2.17: Particle horizons. We have sketched the particle
horizon, which defines the size of the observable universe, at
present ($t = t_0$) and at an earlier instant ($t = t_1$). It is defined as
the present size of the domain that at $t = 0$ was contained in
our past light cone, the dark blue arrow. For conceptual clarity
the figure features a flat universe with a beginning. It illustrates
the fact that our present causal domain breaks up into many
independent domains at early times.


Figure 1.2.18: Causal domains. The inflationary universe has a
brief inflationary epoch of less than $10^{-32}$ s shortly after the big
bang, in which it expanded exponentially with a factor of 107”.




The flatness problem. The flatness problem is posed by
the observation that, when the model is fitted to the data, the
conclusion is that we live in a universe where the curvature
constant $k$ is very close to zero. From a theoretical point
of view there is no reason to expect it to be zero; yet if it is
zero now, it must have been zero all along. From the fact that
our universe after 13.7 billion years has a $k$ value so close to
zero, one may show by calculating backward that this would impose
a very unnatural initial condition on the universe. One finds
that the value of the curvature constant would have to be
fine-tuned to zero to some sixty decimal places! That is
considered to be an exceptional choice, which begs for an
explanation. It turns out that there is a satisfactory solution
to this problem and again it involves the vacuum energy
and the De Sitter solution.


If you go back to the Friedmann equation (1.2.9) and look 
at the right-hand side, you see that the vacuum energy 
is a constant positive part of the density $\rho$. However this 
constant is multiplied by $a^2$, and thus this term (if present) 
will under all circumstances grow faster than the 
second term that corresponds to the curvature constant k. 
What this means is that a universe that goes through such 
an exponential phase will blow up and effectively become 
flat. The situation is somewhat analogous to the claim by 
some Dutch people that their country is flat; it is indeed ef- 
fectively flat, but not really. It is better to say that the curva- 
ture radius of the earth is much larger than their visual hori- 
zon. Going back to the universe, what this means is quite 
interesting. If you could turn the vacuum energy on for a 
limited amount of time, the exponential expansion would 
basically flatten out the universe. This is a vital observa- 
tion because it would furnish a dynamical mechanism by 
which the universe drives itself to that unique point in the 
solution space where k is effectively zero! The universe 
would end up being flat, becoming open independent of 
the initial situation. 
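
The flattening mechanism can be made explicit in one line. What follows is only a sketch: it assumes that the Friedmann equation (1.2.9) has the standard form in units where c = 1, with a(t) the scale factor, and it introduces the symbols $\rho_{\rm vac}$ (a constant vacuum energy density), $H$ and $a_0$ purely for illustration.

```latex
\dot a^{2} = \frac{8\pi G}{3}\,\rho_{\rm vac}\,a^{2} - k
\quad\Longrightarrow\quad
a(t) \simeq a_{0}\,e^{Ht},
\qquad
H^{2} = \frac{8\pi G}{3}\,\rho_{\rm vac},
\qquad
\frac{|k|}{H^{2}a^{2}} \;\propto\; e^{-2Ht}.
```

Under these assumptions the curvature term, measured relative to the vacuum term, dies away exponentially for as long as the vacuum energy is switched on, whatever the initial value of k.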


What do the experiments tell us? The data unequivocally 
suggest that there must have been a brief period in the 
very early universe in which it expanded exponentially. And 
that brief period of cosmic inflation, as it is called, is the 
reason we find ourselves in a flat (k = 0) universe now. 
We have pointed out two serious problems where the stan- 
dard cosmological model clashes with the data, and both 
are resolved in the inflationary scenario to which we turn 
next. 


Figure 1.2.19: Cosmic inflation. Inflation makes the present 
observable universe fit in the expanded image of a causal 
domain of subnuclear size. 


The inflationary scenario. 


The inflationary scenario involves a non-vanishing vacuum 
expectation value of a so-called inflaton field, presumably 
some scalar field that has not really been identified. In a 
very, very early stage of the universe, say, at $t \sim 10^{-35}$ 
seconds, due to the cooling of the universe this inflaton 
field gets stuck in a metastable vacuum state. This means 
that it generates a constant vacuum energy in the universe, 
and this will last for about $t \sim 10^{-32}$ seconds, after which it 
will decay to a new lower zero energy ground state. During 
this period with the non-vanishing vacuum energy present, 
the linear dimension of the universe would inflate 
by a factor of $10^{27}$ (corresponding to $10^{81}$ for the 
volume). Inflating a causal domain by such a huge factor 
solves the horizon problem as is indicated in Figure 1.2.19. 
The epoch ends with a phase transition of the early uni- 
verse as a whole. The latent heat released in this transi- 
tion will be converted into ordinary matter. Such a scenario 
implies a drastic revision of the very early stages of the 
standard cosmological model. Note that though the time 
periods appear to be extremely short, this is only relative: 
the inflationary epoch lasts 1000 times the age of the 
universe at that time! You could therefore equally well say 
that it took ‘ages.’ 
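
The orders of magnitude in this scenario are easy to check with a few lines of arithmetic. The sketch below takes as assumptions an onset at about 10^-35 seconds, a duration of about 10^-32 seconds and a linear inflation factor of 10^27; the variable names are ours, and the figures are rough estimates rather than measured values.

```python
import math

# Illustrative numbers (assumptions for this sketch, in seconds).
t_start = 1e-35          # approximate age of the universe when inflation begins
duration = 1e-32         # approximate duration of the inflationary epoch
linear_factor_exp = 27   # linear inflation factor written as 10**27

# How long inflation lasts, measured in units of the age at its onset.
relative_duration = duration / t_start

# Number of e-folds N, defined by a_end / a_start = exp(N).
e_folds = linear_factor_exp * math.log(10)

# Volume scales as the cube of the linear size.
volume_factor_exp = 3 * linear_factor_exp

# The curvature term relative to the vacuum term scales as 1/a**2,
# so it is suppressed by the square of the linear factor.
curvature_suppression_exp = -2 * linear_factor_exp

print(f"duration / age at onset : {relative_duration:.0f}")
print(f"number of e-folds       : {e_folds:.1f}")
print(f"volume factor           : 10^{volume_factor_exp}")
print(f"curvature suppression   : 10^{curvature_suppression_exp}")
```

With these inputs the expansion amounts to a little over sixty e-folds, which is why such a modest-looking exponential phase suffices to wash out any initial curvature.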


There is one other observational aspect of early universe 
cosmology that this scenario gives an answer to. The enor- 
mous inflation factor basically implies that our whole ob- 
servable universe originates from an extremely small do- 
main before inflation started. The domain would be so 
small that the physics within that domain would be 
governed by quantum theory. In particular, that implies that 
within such a domain of size $\Delta x$ there are substantial 
quantum fluctuations, and that these fluctuations have a flat, 
scale invariant spectrum, meaning that their amplitude is 
independent of their wavelength. These small wavelength 
fluctuations ($\lambda < \Delta x$) are blown up to large-scale 
inhomogeneities by the inflation. And it is believed that these 
inhomogeneities are the seeds of large-scale structures in the 
subsequent evolution of the universe. Knowing the initial 
spectrum at the end of the inflationary epoch allows one to 
predict what the inhomogeneities and anisotropies in to- 
day’s cosmic background radiation would be. And indeed 
the scale invariant initial spectrum evolves into a highly 
non-uniform distribution with damped oscillations which agrees 
extremely well with what has been observed by space ob- 
servatories like WMAP and PLANCK as is shown in Figure 
1.2.20. 


This surprising scenario combines knowledge from the 
microscopic realms of quantum field theory with knowledge 
from general relativity and cosmology, and allows for a 
solution of both the horizon and the flatness problem of 
standard cosmology. Scenarios of this type were proposed and 
developed in the early 1980s by Alan Guth from MIT, Andrei 
Linde presently at Stanford University, and Paul Steinhardt 
presently at Princeton University. 


Figure 1.2.20: CMB anisotropy. The inflation blows tiny quantum 
fluctuations in the initial causal domain up to large-scale 
inhomogeneities that are believed to be the seeds of large-scale 
structure in the present universe. These show up in the angular 
correlations in the spectrum of temperature anisotropies of the 
cosmic background radiation. From this data the three energy 
parameters and the cosmic curvature constant in the model can 
be determined. (Source: PLANCK mission) 


Splendid observations. Having presented these fasci- 
nating theoretical considerations, let us briefly review the 
stunning progress that has been made in observational as- 
tronomy and cosmology. The fundamental observational 
parameters in the cosmological models are the energy 
densities $\Omega_i$ and the Hubble constant, and these basically tell 
you what the curvature constant k is. Two completely 
different techniques have been used: 


(i) The measurement of very distant type Ia supernova 
events. These are basically very remote sources that 
allow us to extend the Hubble law plot of Figure 1.2.11. A 
great experimental effort by Saul Perlmutter and 
collaborators (1998) managed to expand the diagram by a factor of 
ten, and the spectacular discovery they made was that the 
plot no longer stays linear but curves upward. This 
means that at large distances we see the expansion 
accelerate. They fitted the data and extracted the $\Omega$ values, 
and clearly obtained a positive contribution for the vacuum 
term. With Adam G. Riess and Brian P. Schmidt, Saul 
Perlmutter was a co-recipient of the Nobel Prize in Physics in 
2011, and the prize was awarded ‘for the discovery of the 
accelerating expansion of the Universe through 
observations of distant supernovae.’ 


(ii) The measurement of the curvature through measuring 
the anisotropy in the microwave background radiation 
also gives, among many other things, the $\Omega$ values. 
There have been a number of space telescopes to do this 
kind of work: first the COBE (1992), then WMAP (2003) 
and most recently the PLANCK (2013) mission, with again 
startling results. For this line of research the Nobel Prize 
in Physics of 2006 was awarded jointly to John C. Mather 
and George F. Smoot of the COBE collaboration ‘for their 
discovery of the blackbody form and anisotropy of the cos- 
mic microwave background radiation.’ 


Concerning the relative energy densities, the upshot of 
these experiments is summarized in the energy piechart 
depicted in Figure 1.2.21. After the PLANCK mission the 
preferred fractions are: 68.3 % is in the form of vacuum 
or dark energy, 26.8 % in the form of dark (not luminous) 
matter and only 4.9 % is in the form of ordinary luminous 
matter. The conclusion is crystal clear: we are living in 
a vacuum dominated, flat universe! What that means ul- 
timately is also not hard to understand. The remarkable 
message is that 95 % of all the energy in the universe re- 
sides in the dark matter and energy components, and is 
therefore in a form that is unknown to us! It reminds us of 
the words of the Chinese philosopher Lao Tzu: ‘The more 
we know, the less we understand!’ More than anything 
else, science is the story of work in progress, every time 
reminding us of our ignorance, and forcing us to cope with 
it. Or to find creative ways to beat it. Indeed, science will 
always run into new walls, or to be more encouraging: new 
profound challenges. 


Figure 1.2.21: The energy piechart. A piechart of the relative 
contributions of dark (vacuum) energy, dark matter and ordinary 
matter as determined by the WMAP and PLANCK space 
observatories. Conclusion: our universe is vacuum dominated. 


Today’s challenges. We have to conclude that looking at 
the presently available data, this consistent and convinc- 
ing, evidence-based inflationary model of the evolution of 
the universe still leaves us with some big puzzles. 


Dark matter. The first is the question of what dark matter 
actually is. Clearly this is a question that has been taken up 
by the particle physicists who built their Large Hadron 
Collider (LHC) at the European accelerator center CERN in 
Geneva. They are presently hunting for a new particle 
type that would fit the profile of dark matter. Such particles 
should be ‘sterile,’ meaning that they interact very weakly 
with ordinary matter, and they should be massive in order 
to cause the gravitational effects we observe. They are 
expected to form a species of so-called WIMPs: Weakly 
Interacting Massive Particles. Theoretical candidates are 
for example the species of lightest supersymmetric 
particles (a necessary ingredient of String Theory), or various 
types of massive particles that are called ‘sterile’ neutrinos 
(fitting in certain Grand Unified Theories). 


Dark energy. The second even more profound puzzle is 
the observed non-zero value of the cosmological constant, 
or vacuum energy. It is ‘small’ but definitely non-zero, and 
the question is whether we can find a theoretical 
explanation for its existence and its magnitude. The irony is that 
physicists have for quite some time been looking for argu- 
ments why it would have to be strictly zero exactly because 
there was an extremely strong bound on it from observa- 
tion. They looked for a principle that would protect the zero 
value of the cosmological constant, like the gauge principle 
protects the zero mass property of the photon. Needless 
to say, they didn’t succeed, fortunately in fact, because 
now we know that it is not zero to start off with. Answering 
this question requires a fundamental insight into the nature 
of the vacuum, and so far there is no way to calculate the 
quantum energy of the vacuum from first principles. Such 
an explanation should also allow us to make a first esti- 
mate of its magnitude, because in spite of the fact that it 
is the dominating energy content of the universe, its actual 
value is mesmerizingly small: $\Lambda = 1.1 \times 10^{-52}\ \mathrm{m}^{-2}$. This 
mass energy density is about four protons per cubic meter, 
which amounts to $\rho_{\rm vacuum} = 5.9 \times 10^{-27}\ \mathrm{kg/m^3}$. 
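
The conversion between the cosmological constant and the equivalent mass density can be checked directly. The sketch below assumes the standard relation $\rho = \Lambda c^2 / (8\pi G)$ from general relativity and uses rounded reference values for the constants; the variable names are ours.

```python
import math

# Physical constants in SI units (standard reference values, rounded).
G = 6.674e-11         # Newton's constant, m^3 kg^-1 s^-2
c = 2.998e8           # speed of light, m/s
m_proton = 1.673e-27  # proton mass, kg

# Cosmological constant as quoted in the text, in m^-2.
Lambda = 1.1e-52

# Equivalent mass density of the vacuum: rho = Lambda * c^2 / (8 * pi * G).
rho_vacuum = Lambda * c**2 / (8 * math.pi * G)

# The same density expressed as a number of protons per cubic meter.
protons_per_m3 = rho_vacuum / m_proton

print(f"rho_vacuum = {rho_vacuum:.2e} kg/m^3")
print(f"protons per cubic meter = {protons_per_m3:.1f}")
```

The result lands close to the quoted density, and at between three and four proton masses per cubic meter, consistent with the ‘about four protons’ of the text.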


From a theoretical point of view, the conclusion is that the 
De Sitter solution, which was discarded for a long time 
as physically irrelevant, has made a glamorous comeback, 
and presently plays a vital role in understanding the deep 
past, as well as the present and future of our universe. 
Remarkable! 


Figure 1.2.22: Magritte: the pilgrim (1966). My title for this 
intriguing surrealistic painting would be: ‘Let’s face the void, and 
void the face.’ If that isn’t a deep thought, then neither is its 
negation! (Source: © Photothèque Magritte / Adagp Images, Paris) 


Much ado about nothing. The 
handicap of generalists is that they 
know virtually nothing about almost 
everything, and the handicap of nerds 
is that they know virtually everything about almost 
nothing. What? Knowing everything about nothing? 
I wish it were true. Closer inspection shows that 
the science in-crowd knows little to nothing about 
nothing. Scientists remain silent, but spend sleepless 
nights worrying about nothingness. Imagine 
some DOE Innovation Initiative inspection team 
performing a lab raid and asking what you are 
doing with all that taxpayers’ money, and you would 
have to answer that you are working on ‘nothing.’ 
Oh yes, you are just mucking around, are you? 
That would undoubtedly result in you taking a deep 
dive in the cool lake of depression. Career-wise I 
would avoid talking about the void. 


You would think that empty space — the vacuum, 
the void, nothingness — is a trivial no-brainer not 
worth pondering about. Note however, the following 
important remark that the legendary physicist 
John Archibald Wheeler made at some point: ‘No 
point is more central than this, that empty space 
is not empty. It is the seat of the most violent 
physics.’ Here, a deep truth appears to be lurking. 
A quantum truth. 


The reason we don’t need to talk about it in physics 
is that in real experiments we are always 
dealing with energy differences. We compare 
energies and the energy of the vacuum ‘drops out.’ 
We therefore can set it equal to zero if we would 
like to do so. This is fortunate, because we do not 
know how to calculate the vacuum energy from 
first principles, and all ‘serious’ efforts to do so 
typically give infinity as an answer. This means 
that the void is challenging our deepest scientific 
intuitions. General relativity is a comrade in arms, 
because as we saw, it is sensitive to something 
that other theories would not detect. Space-time 
itself allows for an absolute measurement of the 
energy including that of the vacuum. And moreover, 
space-time measurements have just told us that 
the vacuum energy is not just non-zero, it is in 
fact the dominant form of energy in our universe! 


This much is certain, ‘nothing’ does not exist and 
the notion of nothingness is an apparent delusion. 
What does exist is our ignorance about it. 


So, what’s so tricky about nothing? An average fish 
would reply: ‘Well, no fellow-fish, no water-plants, 
no play-rock and no gravel on the bottom.’ But what 
the average fish would never say is: ‘no water.’ The 
fact that nothing is something in which he couldn't 
exist doesn’t enter his fishy head. The average per- 
son by now understands damn well that without air 
he is going to choke, but apparently in nineteenth 
century educational institutions, that simple fact still 
had to be demonstrated by putting a little bird under 
a glass bell and pumping out the air. Just to prove 
that ‘nothing’ can also be quite harmful. Causing 
all sorts of panic because of the ‘unbearable heavi- 
ness of not-being.’ ‘To be or not to be,’ remains the 
question. Having answered that, ‘to understand or 
not to understand’ is the next question. 


The physics of geometry 


With the advent of the theories of relativity and gauge the- 
ories for the description of the fundamental forces, a new 
golden age for geometry in the realm of physics emerged. 
This section on ‘the physics of geometry’ will give you an 
introduction to the basic notions of geometry that have 
played a crucial role in modern physics. 


We will talk about the notion of curved spaces (smooth 
manifolds) and which concepts are essential if one wants 
to do physics on and with them. Some aspects of topology 
are mentioned like homotopy, because it leads to an alter- 
native way of understanding why certain physical quanti- 
ties turn out to be quantized and conserved. 
Introducing regular coordinates on curved spaces often 
forces us to define overlapping patches or charts. One 
may introduce vectors that live in the tangent space of 
some point, whereas the collection of the base manifold 
and all its tangent spaces form the so-called tangent bun- 
dle of the space. Now we may study the transport of vec- 
tors along paths. This leads to the notion of holonomy 
which is linked to the integrated curvature over a region of 
space. We complete this lightning review by summarizing 
the relations between the metric, which provides a local 
sense of distance and size, a connection which connects 
vectors at different points, and a local curvature which is 
very much like the field strength in a gauge theory. 


After this pure geometrical part we show the close relation- 
ship of gauge theories, describing the non-gravitational in- 
teractions, with fiber bundles. This provides a geometrical 
understanding of the gauge potential or connection $A_\mu$, and 
a gauge invariant field strength or gauge curvature $F_{\mu\nu}$. 
This representation of gauge theories opens the door for 
understanding their topological features, like the existence 
of magnetic fluxes or monopoles, and more generally to 
the notion of topological charges and quantum numbers. 
This section aims to highlight the intimate relationship be- 
tween physics, geometry and quantum theory. 


Living in flatland. When you talk about a space, most 
people have the natural inclination to think of a flat 
Euclidean space, like the plane denoted by $\mathbb{R}^2$. And on a plane 
life is simple not only for the Danes and the Dutch, but 
also for physicists. It is simple to choose a coordinate grid 
to label the points in the space. The shortest paths 
between points are straight lines, and defining vectors like 
momenta and the forces of electric fields is also simple. 
You just draw arrows based at a point in the space 
because a flat Euclidean space is also a vector space. There 
is no distinction; you may think of vectors as living in the 
‘same’ space as you. On flat Euclidean space we can 
define functions and derivatives of functions (basically 
vectors), as well as their integrals. If we have a particle 
moving on the plane in a potential $U(x)$ then it will experience 
a force $F(x) = -\nabla U(x)$ which corresponds to a field of 
vectors over the plane. In other words we can do calculus, 
and therefore flat space is the basic example of a space or 
manifold that is differentiable. 


(a) Triangle in flat space, sum of angles is 180°. 


(b) Parallel transport of a vector in the plane. 


Figure 1.2.23: Carrying vectors around. Parallel transporting 
around a closed loop in a flat space has no net effect on the 
orientation of the vector. 
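
To see what ‘doing calculus on the plane’ amounts to in practice, here is a minimal numerical sketch of the force field derived from a potential. The harmonic potential, the step size h and the function names are hypothetical choices made only for illustration.

```python
def potential(x, y):
    # Hypothetical example potential (an assumption for illustration):
    # a two-dimensional harmonic well U = (x^2 + y^2) / 2.
    return 0.5 * (x * x + y * y)


def force(U, x, y, h=1e-6):
    # Central-difference approximation of F = -grad U at the point (x, y).
    dU_dx = (U(x + h, y) - U(x - h, y)) / (2 * h)
    dU_dy = (U(x, y + h) - U(x, y - h)) / (2 * h)
    return (-dU_dx, -dU_dy)


# The force at the point (1, 2) points back toward the origin.
fx, fy = force(potential, 1.0, 2.0)
print(fx, fy)
```

Evaluating `force` on a grid of points would give exactly the ‘field of vectors over the plane’ mentioned above: one arrow attached to every point.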


Parallel transport of vectors. In Figure 1.2.23 we show a 
Master Chef and two of his branch managers running an- 
nexes in other parts of town. He wants to send a secret 
recipe around in the form of a vector, and it is in the orien- 
tation of the vector in which the subtle balance of spices 
that earned the Chef his Michelin star is encoded. So, it is 
crucially important to preserve the direction in 
which the vector points, implying that the Chef cannot send 
the recipe by mail. He decides to hire a messenger, an 
apprentice so to say, in a grey suit and with a leather brief- 
case to carry around the vector. The messenger should 
take care to parallel transport the vector. This is not hard: 
while moving along the shortest route consisting of straight 
line segments, he has to ensure that the angle between the 
vector and the direction of his motion (the path) stays the 
same. The Chef has ordered him to pass by again at the 
end of the trip, so he can check whether he did the parallel 
transporting correctly. And as you see the apprentice suc- 
ceeded in perfectly performing his task, as is confirmed by 
the independent juror standing in the corner. 


In this subsection we have mentioned some features of 
flat space that are so natural that you wouldn't think of 
them as particularly interesting. However, what we will ex- 
plore in the remainder of this section is that in a curved 
space, these simple concepts will become much more 
involved. 
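
As a preview of what changes on a curved space, the messenger's trip can be repeated numerically on a sphere. The sketch below is an illustration under stated assumptions: the space is the unit sphere, the route is the triangle formed by three quarter great circles bounding one octant, and parallel transport is approximated by repeatedly projecting the vector onto the local tangent plane. All function names are ours.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    n = math.sqrt(dot(u, u))
    return tuple(a / n for a in u)


def transport_along_arc(v, p, q, steps=1000):
    """Parallel transport the tangent vector v from p to q on the unit sphere.

    p and q are orthogonal unit vectors, so the great-circle arc between
    them is x(t) = cos(t) p + sin(t) q for t in [0, pi/2].
    """
    for i in range(1, steps + 1):
        t = (math.pi / 2) * i / steps
        x = tuple(math.cos(t) * a + math.sin(t) * b for a, b in zip(p, q))
        # Project v onto the tangent plane at the new point, then
        # restore its length (parallel transport preserves length).
        d = dot(v, x)
        v = normalize(tuple(a - d * b for a, b in zip(v, x)))
    return v


# Corners of one octant of the sphere.
A, B, C = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

v_start = (0.0, 1.0, 0.0)  # tangent vector at A, pointing toward B
v = v_start
for p, q in [(A, B), (B, C), (C, A)]:
    v = transport_along_arc(v, p, q)

# Angle between the vector the messenger left with and the one he returns with.
cos_angle = max(-1.0, min(1.0, dot(v_start, v)))
angle = math.degrees(math.acos(cos_angle))
print(v, angle)
```

In contrast to the flat case of Figure 1.2.23, the vector comes back rotated by a quarter turn, which equals the area of the enclosed octant (π/2) on the unit sphere. The Chef would not be pleased.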


Curved spaces (manifolds) and topology 


Modern physics makes use of the mathematical knowl- 
edge about curved spaces or manifolds, both in the theory 
of relativity, but also in the theories that describe gauge 
interaction between elementary particles. What we want 
to introduce are what is known as differentiable manifolds, 


curved spaces that look locally like flat Euclidean space 
and therefore globally allow for defining functions and their 
derivatives (and vectors). These are spaces on which one 
can consistently define calculus, a necessary tool to de- 
scribe dynamical systems in such spaces. And that is what 
physicists like to do. We start by defining some elementary 
notions of topology , and then add the ingredients of differ- 
ential geometry like coordinate systems, vectors, metric 
and curvature. 


Positive and negative curvatures. It is easy to imagine 
curvatures of a surface when we embed it in a higher di- 
mensional Euclidean space. The surface can be defined 
by an algebraic equation. 


Consider for example spheres $S^n$; these are defined by an 
equation $x^2 + y^2 + \ldots = 1$ in $(n + 1)$-dimensional 
Euclidean space $E^{n+1}$. In Figure 1.2.24 we show the spheres $S^0$ 
(two points), $S^1$ and $S^2$. These spaces are finite or compact. 
They can be obtained from one another by suitable rotation 
in the Euclidean space two dimensions higher. 


Spaces of negative curvature are for example hyperbolic 
spaces. In Figure 1.2.25 we depicted two hyperbolic 
surfaces $H^2$ and the corresponding equations to contrast them 
with the two-sphere. One of the hyperboloids consists 
of two disconnected sheets while the other has only one 
sheet. These spaces are infinite. These two spaces can 
be generated by rotating a given hyperbola in the plane 
about an appropriate axis in that plane. The sphere is 
by definition the set of all points that have a fixed 
Euclidean distance to the origin. The double-sheeted 
hyperboloid can be thought of as the set of all events in 
three-dimensional Minkowski space at a fixed space-time 
interval from the origin. The sheets also represent the 
three-dimensional energy momentum vectors $(E, p)$ of a particle 
with finite rest mass $m$ satisfying $E^2 = m^2 + p^2$. 
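
The kinship between the sphere and the hyperboloid shows up clearly when both are parametrized: ordinary angles for the sphere, and one hyperbolic angle for the mass shell. The following sketch (in units where c = 1, with parameter names chosen here for illustration) simply verifies that the sampled points satisfy the two defining equations.

```python
import math


def on_sphere(theta, phi):
    # Point on the unit two-sphere x^2 + y^2 + z^2 = 1.
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def on_mass_shell(m, eta, phi):
    # Point (E, px, py) on the upper sheet of E^2 - px^2 - py^2 = m^2.
    # eta is a hyperbolic angle (an illustrative parameter name).
    return (m * math.cosh(eta),
            m * math.sinh(eta) * math.cos(phi),
            m * math.sinh(eta) * math.sin(phi))


x, y, z = on_sphere(0.7, 2.1)
E, px, py = on_mass_shell(2.0, 1.3, 0.4)

sphere_check = x * x + y * y + z * z   # should equal 1
shell_check = E * E - px * px - py * py  # should equal m^2 = 4

print(sphere_check, shell_check)
```

The only difference between the two cases is the replacement of cos and sin by cosh and sinh, which is the ‘additional minus sign’ in the embedding equation at work. The lower sheet is obtained by flipping the sign of E.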


Topological features. A topological feature or 
characteristic is one that doesn’t change under a continuous 
deformation of the space. Cutting and pasting the space is 
not allowed. Topology is like rubber-sheet geometry where 
stretching and shrinking in any direction is allowed but 
tearing the sheet is not. Two spaces are topologically 
equivalent if they can be continuously transformed into each 
other. 


Figure 1.2.24: Three spheres. This figure shows the spheres 
and their embedding equations. The ‘zero’-sphere $S^0$, described 
by the equation $x^2 = 1$, consists of two points. The circle or 
one-sphere $S^1$ is defined by the equation $x^2 + y^2 = 1$, and $S^2$ by its 
natural higher-dimensional extension. 


Boundaries and holes. Let us start with one-dimensional 
spaces like smooth curves. For example a line segment is 
smooth and has two point-like boundaries. But we could 
also consider a closed curve. It may look like a circle or 
the number zero which has no boundary, but it has a hole. 
The figure ‘eight’ is also closed (has no boundary) and it 
has two holes but now it is not a one-dimensional space 
because it has a singular point where the lines split and 
where it therefore is locally not like $\mathbb{R}^1$. 


Connectivity. The spaces just mentioned are all path-wise 
connected, meaning that for any two points one chooses 
there is a path connecting them. The number ‘10’ as a 
space is not connected: it has two disconnected 
components, the ‘1’ and the ‘0’. One open component ‘1’ has two 
point-like boundaries, and the closed component has no 
boundary. If we consider any two points on the line 
segment then these can be connected by some path, and all 
paths can be continuously deformed into each other. 
Taking two points on the ‘0’ or a circle, we find that there are 
many possible paths that connect the two points. These 
paths may wind an arbitrary number of times around the 
hole and such paths cannot all be continuously deformed 
into each other. We say that the ‘1’ is simply connected 
whereas the figure ‘0’ is multiply connected because there 
are topologically distinct paths. So we are invited to 
further refine the notion of connectivity. Let us do that after 
we have moved the discussion one dimension up and 
consider smooth two-dimensional surfaces. 


Figure 1.2.25: Hyperbolic planes. This figure shows two 
different two-dimensional hyperbolic spaces and their embedding 
equations. They are closely related to the equation for the 
two-sphere, and differ by additional minus signs. The sphere has 
positive curvature, while the hyperboloids have negative 
curvatures. 




The simplest finite, two-dimensional spaces have the topol- 
ogy of a disc. It is simply connected and it has one bound- 
ary with the topology of a circle. Note that a boundary in 
two dimensions in general is a disconnected union of one- 
dimensional closed curves, which are topologically speak- 
ing circles. If the boundary has more than one component, 
the space becomes multiply connected. 


To imagine a curved space one may for instance think of 
the two-dimensional surface of a sphere or torus as 
embedded in a three-dimensional flat Euclidean space $\mathbb{R}^3$. 
If you look at a small neighborhood of any point in these 
curved spaces $S^2$ or $T^2$, you see that locally, it is 
everywhere like the flat space $\mathbb{R}^2$. 


It is only after you enlarge your horizon that it becomes 
clear that the sphere and the torus are quite different glob- 
ally from flat space and from each other. Indeed the study 
of curved spaces descended on us with the insight that the 
earth turned out not to be flat. The sphere and the torus are both globally compact, 
meaning that they are finite: it takes a finite amount of paint 
to cover the two-sphere for example, whereas flat space 
is infinite and non-compact. Similarly a three-dimensional 
sphere would have a finite volume. 


Spheres and tori are finite spaces, and they also have 
the property that they have no boundary. Indeed curved 
spaces can be finite and not have a boundary. Yet, they do 
have a different topology; for example the two-torus has 
a hole in it while the two-sphere has not. This means that 
the connectivity properties will differ and this in turn implies 
that the physics in the one space may exhibit features dif- 
ferent from the other. 


For two-dimensional manifolds without boundaries (also 
called closed Riemann surfaces) the number of holes is 
the only topological invariant characterizing the manifold. 
A pretzel is therefore distinct from a donut, not only qua 
substance and taste but also topologically, as it has two 
holes. 




Figure 1.2.26: The pretzel-transformation. This figure shows 
in clockwise steps how a self-linked pretzel (top left) can be 
smoothly unlinked (bottom left). It is a nice example of a 
topological deformation as presented by Martin Gardner in his book 
(1987) on mathematical recreations. 


In spite of the pretzel’s simplicity its topology is surprisingly 
counter-intuitive as we have illustrated in Figure 1.2.26. 
One can clearly imagine the left and right parts of the pret- 
zel to be interlinked like the real pretzel in the center and 
schematically depicted at the top on the left. It appears 
like yet another two-dimensional closed surface which is 
topologically distinct with some two holes and a half! Is it 
really? The answer is: No! There is a well-known smooth 
topological deformation that corresponds to a smooth un- 
linking of the pretzel. In the figure we depicted the subse- 
quent steps in the so-called pretzel-transformation which 
shows how a self-linked pretzel can smoothly be unlinked. 
I always imagined that this somehow must be of use if you 
end up in the unfortunate situation of being handcuffed 
for some reason, for example because of stealing pret- 
zels! 


Homotopy. An important topological characteristic is 
denoted as the connectivity of a space, which can be probed 
by studying maps of closed paths or loops into that space. 
The loops that can be continuously deformed into each 
other are called homotopic. Homotopy is an equivalence 
relation. Two loops are either homotopic or not. Having 
such a relation allows you to divide the space of all maps of 
loops into distinct classes, homotopy classes in this case. 
For example if you draw a closed loop on a sphere, this 
loop can always be smoothly contracted to a point. The 
popular wording of this fact is that ‘You cannot lasso a 
basketball.’ From Figure 1.2.27, we see that all loops on 
the sphere can indeed be deformed into each other and 
can smoothly be contracted to a point, so there is only one 
homotopy class, the trivial class denoted by [0]. 


Figure 1.2.27: The two-sphere is simply connected. If we take 
a point and consider closed loops starting and ending in that 
point, all possible loops can be smoothly deformed into each 
other, and all loops are contractable to the point. There is only 
one trivial homotopy class, denoted by [0]. 


However, if you look at closed curves on a torus, then there 
are many possibilities. There are loops that can be simply 
contracted to a point, then there are loops that wind around 
the big hole, like the big circle on the outside, or closed 
curves that wind around that hole an arbitrary number of 
times. There are also loops ‘perpendicular’ to the 
previous ones, going around the small hole a certain number of 
times. We have illustrated the situation for the torus in 
Figure 1.2.28, where we have drawn three examples. The 
yellow loop is contractable and therefore belongs to the trivial 
class which we denote by [0, 0]. The green loop encircles 
the large hole once: it is non-contractable and belongs to 
the class [0, 1]. The red loop encircles the small hole once, 
and cannot be deformed to either of the other two, since 
it belongs to another class [1, 0]. In general a loop that 
winds m times around the small hole, and n times around 
the big hole belongs to the class [m, n]. Think for 
example of a hiking boot as a closed two-dimensional surface: it 
may have ten holes for the shoe lace to go through. When 
I have tied the knot I should have created a closed loop 
belonging to a non-trivial class. 


Figure 1.2.28: The two-torus is multiply connected. We have 
depicted three loops through a point. The yellow one is 
contractable and belongs to the trivial class [0, 0]. The green (red) 
one winds once around the large (small) hole and is 
non-contractable, and it belongs to the class [0, 1] ([1, 0]). 


Having defined and labeled these classes in a systematic 
way we can go one step further and ask if additional 
properties can be assigned to them. The first thing that comes 
to mind is: can we assign an orientation or direction to 
them? This can be done by putting an arrow on them, and 
this allows you to assign negative winding numbers. 


The first homotopy group. A nice property of closed paths 
is that we can compose them by connecting the end point 
of the first loop to the beginning of the second loop and so 
create a new closed path. This composition rule induces a 
map, or more precisely a multiplication rule for the 
homotopy classes: $[\ldots]_1 \circ [\ldots]_2 = [\ldots]_3$. So here we have a set of 
objects (a set whose elements are classes) that we know 
how to multiply but there is no such thing as addition 
defined. This means that this set forms a discrete group 
where the unit element corresponds to the trivial class of 
contractable loops, while the inverse element is the class 
corresponding to the opposite winding number. This group 
is called the first homotopy group, or fundamental group 
and can be determined for any manifold. 
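
To make the multiplication rule concrete, here is a minimal sketch in Python for the torus. It assumes that a class is fully specified by the pair of winding numbers [m, n] introduced above; the name `TorusLoopClass` and its methods are illustrative choices of ours, not standard notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TorusLoopClass:
    """Homotopy class [m, n] of a closed loop on the torus.

    m: net number of windings around the small hole
    n: net number of windings around the large hole
    """
    m: int
    n: int

    def compose(self, other: "TorusLoopClass") -> "TorusLoopClass":
        # Traversing one loop after the other adds the winding numbers.
        return TorusLoopClass(self.m + other.m, self.n + other.n)

    def inverse(self) -> "TorusLoopClass":
        # Reversing the arrow on the loop flips the sign of both windings.
        return TorusLoopClass(-self.m, -self.n)


TRIVIAL = TorusLoopClass(0, 0)   # contractable loops
GREEN = TorusLoopClass(0, 1)     # once around the large hole
RED = TorusLoopClass(1, 0)       # once around the small hole

combined = RED.compose(GREEN)                        # class [1, 1]
back_to_trivial = combined.compose(combined.inverse())
```

For the torus the order of composition does not matter, because its fundamental group is abelian; on other manifolds the order can matter.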


A question that may come to mind is: What does this have 
to do with physics? The answer is: quite a bit. In fact we 
have already seen examples of it. The notion comes up 
if you want to discuss line integrals of some field along
closed curves, as we did for example with the vector potential.
The loop integral corresponded to the enclosed
magnetic field, or the magnetic flux going through the loop.
The group structure tells you how these magnetic fluxes
‘add’. And as it turns out these fluxes can have highly
unexpected composition rules once one studies phases,
not of electrodynamics, but of non-abelian gauge theories.
These considerations also have important applications in
the study of quantum interference effects and the quan- 
tum statistics of particles. These are topics we will get to 
in later chapters of the book. 


Higher homotopy groups. In higher dimensions there are
more possibilities to consider. For one you may think of
higher dimensional holes that correspond to non-contractable
maps of higher dimensional spheres into the manifold,
and these in turn form higher dimensional homotopy
classes. So the second homotopy group tells you
how many topologically inequivalent ways there are to map
a two-sphere (a closed two-dimensional surface) into the
manifold and how those maps can be composed. Finally,
the zero-dimensional homotopy classes label the disconnected
components of the space under consideration.


Coordinate systems. You may wonder why it took so long 
for mankind to figure out that the Earth’s surface we are 
living on is a space that is not flat but curved. The reason 
is that on a local scale the world is basically flat and our 
naive expectations work well as long as you stay nearby. 
So, if we live in a curved space it has to be a space that 
is locally like Euclidean space. A space that is everywhere 
‘locally flat’ is a space that we call smooth because we 
can systematically extend the whole mathematical appa- 
ratus of calculus concerning differentiation and integration 
of functions which we originally defined on flat space. So 
we may expect to be able to give adapted mathematical 
descriptions of the physical laws if we move from flat to 
curved spaces, as relativity tells us to do. 


Euclidean space and coordinates. The first task is
to choose coordinates on the space. The choice
of coordinates is often naturally suggested by the symmetries
of the space. You could think that the symmetries
generate the whole space from a single point, an ‘origin’.
For example, flat space has translation symmetry: we can
move from any point to any other by performing a translation.
$\mathbb{R}^n$ has $n$ independent orthogonal directions in which
a given point can be moved, and the natural choice for
coordinates is therefore the Euclidean coordinate system
$\{x_1, x_2, \ldots, x_n\}$.


Curvilinear coordinates. But nobody forces us to use that
coordinate system. In fact, as soon as we consider
a particular setting in that space we may, for example,
single out a particular point as the center of our space.
Think of how we used to put the Earth in the center, and

Figure 1.2.29: Spherical coordinates. The definition of spherical
coordinates in $\mathbb{R}^3$, denoted by $r$, $\theta$, and $\varphi$.


after the Copernican revolution we put the Sun in the cen- 
ter and so on. Fixing one point as special, we break the 
translational symmetry, but still have the rotational sym- 
metry which leaves that point fixed. And that symmetry 
naturally suggests another choice of coordinates, the system
of spherical coordinates $\{r, \theta, \varphi\}$, as depicted in Figure 1.2.29.
It is as if we think of $\mathbb{R}^3$ as built up of a point
plus a continuous stack of concentric spheres around it.
Similarly we may want to use cylindrical symmetry with
coordinates $\{\rho, z, \varphi\}$, where we think of the space as built
up from a line with a continuous stack of concentric cylinders
around it. And we have already seen that in many
physical applications such orthogonal curvilinear coordinate
systems are much more convenient; they lead to a
framing of the problem that makes it easier to
obtain solutions. For example, if I have a current through
a straight wire along the $z$-axis like in Figure 1.1.18, the
problem becomes cylindrically symmetric, and the magnetic
field $\mathbf{B}(\mathbf{x})$ will have only a $\varphi$-component that will only
depend on the radial coordinate: $\mathbf{B} = B_\varphi(\rho)\,\hat{\mathbf{e}}_\varphi$.


Spherical coordinates. The observation that we think of
spaces generated by symmetries is useful if one wants to
think of typical curved spaces which exhibit those symmetries.
Indeed if we think of the three-dimensional rotations
just mentioned, and we take an arbitrary point in $\mathbb{R}^3$, that
point will indeed trace out a two-sphere, which is a highly
symmetric two-dimensional space. So if we ‘throw away’
the radial coordinate, we are left with an orthogonal coordinate
system on the sphere consisting of the two angles,
the polar angle $\theta$, running from $0$ to $\pi$, and the azimuthal
angle $\varphi$, running from $0$ to $2\pi$, as we have been using all
along. Do these coordinates cover the sphere well? Not
really, it turns out.


Coordinate singularities. The north and the south pole are
clearly problematic. At these points the coordinate system
breaks down: whereas the $\theta$ coordinate is well defined,
the $\varphi$ angle is not. There is no sensible way to assign a
definite $\varphi$ angle to the poles. Note that the real geometry
of the sphere is completely smooth at those points. The
poles are regular points just like any other point on the
sphere. The problem is not the space, but the coordinate
system we have chosen. To solve this coordinate problem
in general one first has to accept that it is not possible to
choose a single coordinate system that covers the whole
sphere without singularity. There is a topological obstruction
to doing that, related to the hairy ball theorem.
This theorem states the easy to imagine fact that it is not
possible to comb a hairy sphere flat. Just try doing it and you
will quickly find out that there is always a point at which the
hairs meet in opposite directions. This means that there is
no globally defined, continuous, nowhere-vanishing tangent
vector field, or equivalently, that any continuous tangent
vector field on a sphere has to vanish at
at least one point. And that fact implies that we cannot
have a single globally defined coordinate frame of orthogonal
unit vectors on the two-sphere.
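
The breakdown of the azimuthal angle at the poles is easy to see numerically. The following is a small sketch, assuming the usual convention in which $\theta$ is the polar angle and $\varphi$ the azimuth; the function name `spherical_to_cartesian` and the sample angles are our own choices.

```python
import math


def spherical_to_cartesian(theta, phi, r=1.0):
    """Map (r, theta, phi) to Cartesian (x, y, z); theta is the polar angle."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


azimuths = (0.0, 1.0, 2.5)

# On the equator, different azimuths label different points.
equator_points = [spherical_to_cartesian(math.pi / 2, phi) for phi in azimuths]

# At the North Pole (theta = 0) every azimuth labels the same point.
pole_points = [spherical_to_cartesian(0.0, phi) for phi in azimuths]
```

All entries of `pole_points` coincide with the North Pole, so the value of the azimuth carries no information there, while the sphere itself is perfectly regular at that point.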


Patches or charts. Knowing this fact, the best we can do is
to cover the sphere by defining two coordinate patches or
charts, that together cover the sphere and have an overlap
so that we can identify points on the two maps that

Figure 1.2.30: Two coordinate patches. The two-sphere is covered
by two overlapping coordinate patches $S_\pm$ to avoid coordinate
singularities. There is a transition map which identifies
points (and tangent vectors) in the overlap region. In this case
we have $\varphi_+ = \varphi$, $\varphi_- = -\varphi$, and $\theta_- = \theta_+ = \theta$.


correspond to the same points on the sphere. For example,
one may define one patch covering a little more than
the northern hemisphere and call it $S_+$, and the other a little
more than the southern hemisphere and call it $S_-$, as indicated
in Figure 1.2.30. The overlap between the patches
is then a narrow band with the equator in the middle. Each
patch has the topology of a disc and so we can put a regular
coordinate grid on it. Now we can define a transition
map on the overlap, which provides a map between the
coordinate systems in both patches. And this map should
be smooth as well. At this point we have succeeded in
making an atlas of the sphere consisting of two charts,
each of which can be smoothly mapped onto a flat page
by a stereographic projection, which you may have encountered
in high school geography classes. This is the
way to deal with the complications that arise in defining
coordinates on a sphere, and this allows us to globally define
smooth functions and their derivatives, to define paths
and vectors, all the things physicists and mathematicians
need and love. With this construction we have shown that
the two-sphere also is a differentiable manifold, a curved
space where we can do calculus. A differentiable manifold
is a space that is locally like Euclidean space, and globally
looks like a smooth patchwork of pieces that are much like
flat space, sewn together in a consistent way by a network
of smooth transition (sewing or gluing) functions.
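
As an illustration of an atlas with a smooth transition function, here is a sketch using two stereographic charts of the unit sphere. Note the assumption: unlike the slightly enlarged hemispheres described above, each of these charts covers the whole sphere except one pole, so the overlap is everything but the two poles. The function names are ours.

```python
import math


def project_from_north(x, y, z):
    """Stereographic projection from the North Pole (0, 0, 1).

    Valid everywhere on the unit sphere except the North Pole itself.
    """
    return (x / (1.0 - z), y / (1.0 - z))


def project_from_south(x, y, z):
    """Stereographic projection from the South Pole (0, 0, -1).

    Valid everywhere on the unit sphere except the South Pole itself.
    """
    return (x / (1.0 + z), y / (1.0 + z))


def transition_map(u, v):
    """Transition between the two charts on their overlap.

    It is the inversion (u, v) -> (u, v) / (u^2 + v^2), which is smooth
    as long as (u, v) is not the origin of the chart.
    """
    s = u * u + v * v
    return (u / s, v / s)


# A sample point on the unit sphere, away from both poles.
theta, phi = 1.1, 0.7
p = (math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     math.cos(theta))

u_n, v_n = project_from_north(*p)
u_s, v_s = project_from_south(*p)
u_t, v_t = transition_map(u_n, v_n)   # agrees with (u_s, v_s)
```

The transition map converts the coordinates of a point in one chart into its coordinates in the other, which is exactly the role of the sewing functions in the definition of a differentiable manifold.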


Distance and path length. So far we have talked about
topological characteristics of manifolds but that leaves the
important aspect of form and scale undetermined. How
long do I have to walk to get from A to B, that’s the question!
Mathematicians like to say that to settle it we have
to add more structure to the space. In order to introduce
the concept of size or distance we have to define a metric
on the space. In flat space we know that the shortest distance
between two points corresponds to the straight line
between them. And the distance is calculated by applying
the Pythagorean theorem. If we consider any smooth
path in flat space, we can calculate its length by successively
applying the theorem to infinitesimal segments of
the path and adding the results. If the points are nearby,
we have for the distance $ds$ that $ds^2 = dx^2 + dy^2$. If
we start by defining a smooth path as a one-parameter
curve $\gamma(t) = \{x^\mu(t)\}$, the tangent vector to the curve at
the point $x(t_0) = \gamma(t_0)$ is just like the ‘velocity’ vector
$\mathbf{v}(t_0) = d\mathbf{x}/dt|_{t_0}$. The length $L_{ab}$ of the curve between
two points $\gamma(a)$ and $\gamma(b)$ is now quite naturally defined as
the integral:


$$L_{ab} = \int_a^b |\mathbf{v}(t)|\,dt. \qquad (1.2.15)$$
In different, more familiar wording, the distance traveled
is just the magnitude of the velocity integrated along the
path over the appropriate time interval. It is this notion of
path length that we would like to generalize to curved spaces.
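
The recipe of applying Pythagoras to many small segments and adding the results translates directly into a few lines of code. This is a minimal numerical sketch for flat space; the function `path_length` and the two sample curves are illustrative choices of ours.

```python
import math


def path_length(curve, a, b, steps=10_000):
    """Approximate the length of a curve between parameter values a and b.

    `curve` maps t to a tuple of Cartesian coordinates. The length is the
    sum of straight segment lengths (Pythagoras applied to small steps).
    """
    total = 0.0
    previous = curve(a)
    for k in range(1, steps + 1):
        t = a + (b - a) * k / steps
        current = curve(t)
        total += math.dist(previous, current)
        previous = current
    return total


def unit_circle(t):
    return (math.cos(t), math.sin(t))


def straight_line(t):
    return (3.0 * t, 4.0 * t)


circle_length = path_length(unit_circle, 0.0, 2.0 * math.pi)   # close to 2*pi
line_length = path_length(straight_line, 0.0, 1.0)             # close to 5
```

Refining the subdivision brings the sum arbitrarily close to the integral of the speed over the parameter interval.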


Metric and line element. To calculate the path length in a
curved space we need a local definition for the infinitesimal
distance $ds$ which specifies the local ($x$-dependent)
definition of an infinitesimal length. Once we have chosen

Figure 1.2.31: Shortest distance. The shortest path between
two points A and B on the sphere is given by the segment of
the yellow ‘great circle’ which is defined as the intersection of
the plane through A, B, and the center of the sphere C, and the
surface of the sphere. The blue and red circles are smaller and
yet yield longer paths between the points A and B.

coordinates $\{x^i\}$ on the space, where $i$ runs from 1 to $d$, the
dimension of the space, then we formally define the so-called
line element $ds$ as:


$$ds^2 = g_{ij}(x)\,dx^i dx^j \qquad (1.2.16)$$
where the symmetric matrix $g(x)$ is the so-called metric
tensor, and a sum over the repeated indices $i$ and $j$ is
implied. In flat space we saw that $ds^2 = dx^2 + dy^2$ and
the metric thus corresponds to the unit matrix everywhere.
If we choose polar coordinates $r$ and $\varphi$, the line element
would be $ds^2 = dr^2 + r^2 d\varphi^2$, and the metric a diagonal
matrix $\mathrm{diag}(1, r^2)$.
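
To see that the polar form of the metric measures the same lengths as the Cartesian one, we can integrate the line element along a curve written in polar coordinates. The sketch below handles diagonal metrics only, and its function names are our own. As a test curve we take the straight segment from $(x, y) = (1, 0)$ to $(0, 1)$, whose length should be $\sqrt{2}$.

```python
import math


def length_with_metric(curve, metric, a, b, steps=20_000):
    """Integrate ds = sqrt(g_ij dx^i dx^j) along a curve (midpoint rule).

    `curve(t)` returns coordinates (x^1, x^2); `metric(x)` returns the
    diagonal entries (g_11, g_22) at that point. Only diagonal metrics
    are handled in this sketch.
    """
    total = 0.0
    dt = (b - a) / steps
    for k in range(steps):
        t0, t1 = a + k * dt, a + (k + 1) * dt
        x0, x1 = curve(t0), curve(t1)
        g11, g22 = metric(curve(0.5 * (t0 + t1)))
        d1, d2 = x1[0] - x0[0], x1[1] - x0[1]
        total += math.sqrt(g11 * d1 * d1 + g22 * d2 * d2)
    return total


def polar_metric(x):
    r, _phi = x
    return (1.0, r * r)          # ds^2 = dr^2 + r^2 dphi^2


def chord_in_polar(t):
    # The segment from (1, 0) to (0, 1) in polar coordinates:
    # r = 1 / (cos(phi) + sin(phi)), with phi = t.
    return (1.0 / (math.cos(t) + math.sin(t)), t)


chord_length = length_with_metric(chord_in_polar, polar_metric, 0.0, math.pi / 2)
```

The result agrees with the Pythagorean value $\sqrt{2}$ up to discretization error: the metric has changed its appearance, but not the geometry.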


If we put a symmetric mass distribution around the origin,
then the space would be curved as we illustrated in Figure 1.2.4,
where the two-dimensional surface embedded
in $\mathbb{R}^3$ would be defined by fixing $z = f(r)$ with a function
$f$ interpolating between some constant $f(0) = -a$
and $f(r \to \infty) = 0$. The radial length measured along
the surface will now change, and indeed the metric would
change in that $g_{rr} = 1 + (df/dr)^2$. The metric on the two-dimensional
surface is induced from the trivial metric on
$\mathbb{R}^3$ by substituting $dz = (df/dr)\,dr$.
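
The substitution can be written out in one line. Assuming polar coordinates $(r, \varphi)$ in the plane and the embedding $z = f(r)$ described above, the flat metric of $\mathbb{R}^3$ in cylindrical form gives:

```latex
ds^2 = dr^2 + r^2\, d\varphi^2 + dz^2
     = dr^2 + r^2\, d\varphi^2 + \left(\frac{df}{dr}\right)^2 dr^2
     = \left[ 1 + \left(\frac{df}{dr}\right)^2 \right] dr^2 + r^2\, d\varphi^2 .
```

Reading off the coefficients we find $g_{rr} = 1 + (df/dr)^2$ and $g_{\varphi\varphi} = r^2$, so only the radial distances are stretched by the deformation.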


It is important to realize that in principle there are many
possible choices for the metric on a manifold, the only restriction
being that it is smooth and compatible with the
topology of the space. These choices lead to different geometries
in the sense of distances, geodesics etc. In the
case of the $S^2$ example we can make the ‘natural’ choice
of metric as we did just before by inducing it from the standard
everyday metric in the space $\mathbb{R}^3$ in which we have
embedded the two-sphere. Squashing the sphere would
naturally change the metric. What makes that choice natural
is that our visual intuitions on vectors and path-length
and angles still make complete sense.


Shortest distances: geodesics. We now are in a position
to answer the question of what the shortest path between
two points on a sphere is. That will again be the path along
which photons and free particles living on $S^2$ would travel,
just like the route your child would presumably take on
their way to the nearest two-dimensional ice-cream parlour.
We will answer this question in more detail later, but
let us first get a feeling and an intuitive idea of the solution.
In Figure 1.2.31 I have marked two points A and B on the
surface, and drawn various paths between them; each of
them corresponds to a segment of a circle on the surface.
It is evident from the drawing that the bright yellow connection
in the middle is the shortest, and it corresponds
to a segment of the equator. The other paths are also
segments of circles, but what sets the yellow one apart is
that it is a segment of a ‘great circle’, a circle of maximum
size on the sphere whose radius equals the radius of the
sphere itself. Great circles are defined as the intersection
of a plane through the centre of the sphere (the point C in
the figure) and the spherical surface. These great circles
are so-called geodesics on the space $S^2$, and correspond
to what straight lines are on the plane: they correspond
to the trajectories of free particles like photons. Shortly
we will discuss the equations that geodesics have to satisfy.
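
A quick numerical comparison shows how much shorter the great circle can be. The sketch below works on the unit sphere and compares two routes between points with the same polar angle: along the circle of constant latitude, and along the great circle. The sample angles and function names are our own choices.

```python
import math


def unit_vector(theta, phi):
    """Point on the unit sphere with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def great_circle_distance(a, b):
    """Arc length along the great circle through a and b (unit sphere)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def latitude_circle_distance(theta, delta_phi):
    """Arc length along the circle of constant polar angle theta."""
    return math.sin(theta) * abs(delta_phi)


theta = math.radians(40.0)        # polar angle, i.e. 50 degrees latitude
delta_phi = math.radians(120.0)   # separation in longitude

A = unit_vector(theta, 0.0)
B = unit_vector(theta, delta_phi)

along_great_circle = great_circle_distance(A, B)
along_latitude = latitude_circle_distance(theta, delta_phi)
```

The great-circle route comes out shorter than the route along the latitude circle, even though the latitude circle is the smaller of the two circles. Only on the equator, which is itself a great circle, do the two distances coincide.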


Vectors on curved spaces. 


And the curved space said: ‘Vectors don’t live here 
anymore.’ 


Tangent space. Assuming that physicists are also living on
that sphere, they need vectors to describe what's going on:
momentum vectors, forces, electric fields, and also quantum
states. On a sphere, those vectors cannot live ‘in the
space’ itself, because the sphere is not a vector space. In
a curved space the notion of a ‘position vector’ no
longer makes sense. To stay close to our flat space experience
we do the following: to define a vector at some point on
the sphere, we first construct the tangent plane to the surface
at that point, and then put the vector there. Because
the tangent space is a copy of $\mathbb{R}^2$ we can add, subtract
and take products of vectors there. So we attach a vector
space to every point on the sphere. If we have a well-defined
system of orthogonal curvilinear coordinates, then locally,
at any point $x$ we can construct a set of orthonormal
tangent vectors along the coordinate axes, and those define
a smooth (orthonormal) frame at any point in the patch.
Having a set of smooth transition functions allows us to extend
such frames over the whole manifold.


Parallel transport of vectors. Knowing how to deal with 
vectors at every point in space is not enough. We want to 
compare vectors at different points, and we want to move 
them around. We need to ‘parallel transport’ the vectors or 
frames from one tangent space to another. The question 
we are now equipped to answer is: what happens if we 
do the exercise with the Master Chef we did in ‘flatland’ 
before? 


We put three people standing at the corners of a spherical
triangle, then we draw the shortest paths between them
and ask the apprentice, the messenger, to bring copies of
the Chef’s vector around. What happens is depicted in
Figure 1.2.32. The instruction is the same: at each point on
the geodesic we first construct the vector tangent to the
curve, which lies in the tangent space of the point. The
Chef's vector to be transported makes a well-defined angle
with the tangent vector. Parallel transport is now defined
by keeping the angle between these two vectors constant
while moving forward along the geodesics. The result of
carefully parallel transporting a vector along the triangle is
depicted in Figure 1.2.32(d). On the first segment of the triangle
the angle is $0$, on the second segment it is $\pi/2$, and
on the third it is $\pi$. It seems to work fine, except that when
the apprentice returns to the Chef, the boss is furious. It is
not hard to see why. Comparing the initial and final, parallel
transported vector, we see that they are not parallel
at all! The transported vector has rotated over an angle of
$\pi/2$. The apprentice is shocked: how could this have happened?
He did after all perfectly follow the instructions all
along, oh my! But the Chef is unrelenting: ‘You are fired!
Out through the backdoor you fool!’


Holonomy. What we may learn from this mini-drama becomes
clear when we turn the story around. It is apparently
simple to find out whether you live in a curved space,
without stepping outside into the embedding space; it suffices
to just walk around some closed paths and parallel
transport a vector along with you, and see whether it is rotated
upon return. So each closed path on the manifold
induces a map of the tangent space onto itself which corresponds
to a rotation. This intrinsic property of a space is
called holonomy and is an important characteristic of curved
spaces. For the example at hand we see that the vector is
rotated by an amount that equals the solid angle bounded
by the loop. The total area of a sphere is $4\pi r^2$, so $4\pi$ for
the unit sphere. And indeed, the triangle covers the area
(or solid angle) of an octant which equals $4\pi/8 = \pi/2$.
It is easy to check that this solid angle-holonomy corre- 
spondence also holds for other simple closed paths. If we 
for example extend the triangle by moving the two points

(a) Rotations generate translations on the two-sphere $S^2$. A triangle
on a curved space has more than 180°.

(b) Vectors at a point in a curved space live in the tangent space
($\simeq \mathbb{R}^2$) at that point.

(c) Carrying a frame around a triangle on a sphere.

(d) Parallel transporting a vector along a closed loop on a sphere
rotates the vector.


Figure 1.2.32: The geometry underlying the tangent bundle of S* . Using the equivalence of S? with a line bundle over the two-sphere. 
Moving a point over S? is the same as carrying a tangent vector over S? . 


on the equator southwards to the South Pole, we obtain a
non-trivial two-angle (!). Going around this two-angle will
yield a holonomy of $\pi$ as it should. A more general way
to state the result is to say that for any closed loop in any
curved space the holonomy equals the net curvature on
any surface bounded by the loop, after proper normalization.
As the (scalar) curvature of the unit sphere is a constant
that equals 2, the curvature enclosed in the loop is then

Figure 1.2.33: An optical fiber bundle. A bundle of optical fibers,
each like a finite light ray, isomorphic to the unit interval: $F = [0, 1]$.
The base manifold $M$ is just a finite disk with a circular
boundary. Every fiber can be projected down to a point in the
base manifold. This bundle is ‘trivial’ in the sense that the bundle
space is just a global product, $\mathcal{E} = M \times F$.


equal to two times the solid angle. 
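
The apprentice's misfortune can be reproduced numerically. The sketch below transports a vector around the octant triangle on the unit sphere. It rests on an assumption of ours rather than on a construction from the text: parallel transport on a surface embedded in $\mathbb{R}^3$ is approximated by moving in many small steps and, after each step, projecting the vector back onto the local tangent plane while keeping its length fixed. All function names are illustrative.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def project_to_tangent(v, p):
    """Remove from v its component along the unit normal p of the sphere."""
    dot = sum(vi * pi for vi, pi in zip(v, p))
    return tuple(vi - dot * pi for vi, pi in zip(v, p))


def slerp(p, q, s):
    """Point at fraction s along the great-circle arc from p to q."""
    omega = math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q)))))
    wp = math.sin((1.0 - s) * omega) / math.sin(omega)
    wq = math.sin(s * omega) / math.sin(omega)
    return tuple(wp * a + wq * b for a, b in zip(p, q))


def transport_around(corners, vector, steps_per_side=2000):
    """Parallel transport `vector` around the closed geodesic polygon."""
    length = math.sqrt(sum(c * c for c in vector))
    v = vector
    for i in range(len(corners)):
        p, q = corners[i], corners[(i + 1) % len(corners)]
        for k in range(1, steps_per_side + 1):
            point = slerp(p, q, k / steps_per_side)
            # Project onto the new tangent plane, then restore the length.
            w = normalize(project_to_tangent(v, point))
            v = tuple(length * c for c in w)
    return v


north = (0.0, 0.0, 1.0)
equator_a = (1.0, 0.0, 0.0)
equator_b = (0.0, 1.0, 0.0)

start = (1.0, 0.0, 0.0)     # tangent vector at the North Pole
end = transport_around([north, equator_a, equator_b], start)

cos_angle = sum(a * b for a, b in zip(start, end))
holonomy_angle = math.acos(max(-1.0, min(1.0, cos_angle)))   # close to pi/2
```

The vector returns rotated by a quarter turn, equal to the solid angle of the octant. Adding the South Pole as an extra corner, as in `[north, equator_a, (0.0, 0.0, -1.0), equator_b]`, gives the two-angle, for which the same procedure yields a rotation by $\pi$.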


Fiber bundles. In a proper description of many physical
systems the most basic ingredients are some space(-time)
manifold $M$ that may or may not be curved, and
certain physical variables like temperature, a fluid flow,
or whatever one is interested in. We assign functions (or
fields) to those variables. For the temperature we define a
function $T = T(x)$ which is a map from the base manifold
to the real numbers: $T : M \to \mathbb{R}$. For the velocity field
this would be a vector field (also called a vector-valued
function) $\mathbf{v}(x)$ which you can think of as a map $\mathbf{v} : M \to \mathbb{R}^3$.


Let us now introduce an upgraded setting for the previous
paragraph, and start with a big space $\mathcal{E} = M \times \mathbb{R}^n$. So
the space looks very much like a bundle of fibers $F = \mathbb{R}^n$
because above any point $x \in M$ we have erected a copy
of the fiber. Now a function on $M$ which takes its values in
$\mathbb{R}^n$ can be viewed as taking a cross-section of the bundle.
In other words, giving $S(x)$ corresponds to drawing some
curved surface above $M$ that intersects every fiber
only once. Figure 1.2.33 gives an intuitive idea of such a
fiber bundle. We start with a base manifold $M$, the physical
space. In this case the base manifold is a simple two-dimensional
disc. Above each point $x \in M$ we erect a
fiber $F_x$ which is isomorphic to the reference fiber (drawn
on the left) and in this case is a finite ray of unit length.
The fibers $F_x$ in $\mathcal{E}$ are transformed images of the reference
fiber. In the picture we also show a local ($x$-dependent)
map $S(x) : M \to F$. Such a map $S(x)$ is called a section
of the bundle; indeed we obtain a deformed surface above
$M$ which is literally a cross-section of $\mathcal{E}$. In this particular
example there is a smooth map $\mathcal{E} \to M \times F$ from
the bundle space to the global product of base and fiber,
which means that the bundle is trivial.


More generally, if we think of the base manifold $M$ as the
space or space-time manifold, then we usually define all
kinds of fields $f(x, t)$ on it. These fields often take values
in some vector space $V$ or an algebra, meaning that we
have a map $f : M \to F$. A natural setting to describe both
the space $M$ and such a function on it is to extend the
manifold to a fiber bundle $\mathcal{E}$, which locally for any neighbourhood
$U_i \subset M$ has a direct product structure $U_i \times F$.
The point is now that globally this is not necessarily the
case. It may be that a basis cannot be extended smoothly
over all of $M$. In such a situation the fiber bundle framework
is very powerful and versatile.


Let us illustrate this difference with another simple example.
Consider the case where the base manifold $M$ is a
circle, $M = S^1$, and the fiber $F$ the unit interval $F = I = \{0 \le y \le 1\}$.
The ‘trivial bundle’ would be a cylinder, corresponding
to the global direct product $\mathcal{E} = S^1 \times I$. But
we could also identify the fibers as $(\varphi = 0, y) \sim (\varphi = 2\pi, 1 - y)$,
in which case we get a Möbius band. This
band has locally the same structure as the cylinder, which
means that if you only were allowed to explore your direct

Figure 1.2.34: The Möbius band. We have depicted the trivial
bundle over the circle with the unit interval as fiber, which is a
cylinder. The tangent bundle of the circle corresponds to an infinite
cylinder: $\mathcal{E} = S^1 \times \mathbb{R}$. On the right we give the topologically
non-trivial bundle corresponding to the Möbius band.


neighbourhood, you would not be able to decide whether
you lived on a cylinder or on a Möbius band. But globally
the situations are very different: walking along the inside
of a Möbius band, you end up on the outside and vice versa.
In other words there is no such thing as an inside or outside as
they are smoothly connected. The Möbius band is a non-orientable
manifold with a single boundary. We have illustrated
the trivial cylinder and the non-trivial, ‘twisted’
Möbius band in Figure 1.2.34. The bundle picture allowed
us to clearly set apart two spaces that are locally the same
but globally (topologically) different. The cylinder is a two-dimensional
flat space in that it needs only one coordinate
patch; it has an inside and an outside separated by two
one-dimensional boundaries. It is topologically like a disc
with the origin taken out; it has no hole and two boundaries.
If you live on the inside and your significant other on
the outside, then that is bad news because you cannot
run into each other. Remarkably, on the Möbius band that
problem is non-existent.
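
The difference between the two gluing rules fits in a few lines. In this sketch we only follow the fiber coordinate $y$ of a walker who circles the base; the function `go_around` and its parameters are our own illustrative names.

```python
def go_around(y, twisted, circuits=1):
    """Fiber coordinate y after walking `circuits` times around the base circle.

    On the cylinder the fibers at phi = 0 and phi = 2*pi are glued as
    (0, y) ~ (2*pi, y); on the Mobius band as (0, y) ~ (2*pi, 1 - y).
    """
    for _ in range(circuits):
        if twisted:
            y = 1.0 - y
    return y


start = 0.25
cylinder_once = go_around(start, twisted=False)              # stays at 0.25
mobius_once = go_around(start, twisted=True)                 # flipped to 0.75
mobius_twice = go_around(start, twisted=True, circuits=2)    # back at 0.25
```

On the Möbius band one circuit takes you to the mirrored position in the fiber, and only after two circuits are you back where you started. Locally nothing distinguishes the two cases; the twist shows up only after a full trip around the base.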


This simple example gives a hint as to how natural and
powerful the geometrical construct of a fiber bundle is,
exactly because for the base space and the fiber there are
very many choices, and each of them gives rise to a subcategory
of bundles with their own specific properties. You
will find that there is a great variety: line bundles, vector
bundles, principal bundles, tangent bundles, frame bundles,
and many others. This world has vigorously been
explored by the mathematicians, and they have developed
a beautiful and rigorous framework in which many physical
applications can be embedded. Books have been written
about the subject and it is not our goal to delve too deeply
into it, except to explore its relevance to the physics subjects
treated in this book.


Tangent bundles.
As we mentioned already, to have parallel transport and
have a proper definition of distance on a curved manifold,
we need extra ingredients. Having the coordinate patches
with transition functions, we can draw continuous curves
and parallel transport vectors. With these attributes we
can not only construct a tangent space at every point of our
base manifold $M$, but we can also define what is called
the tangent bundle of $M$. The tangent bundle is a smooth
$2n$-dimensional manifold $\mathcal{E}$, which consists of $M$ and all
its tangent spaces. It has the structure of a fiber bundle,
because above every point $x \in M$ of the base manifold
of dimension $n$, we have erected a fiber $F$ which is a copy
of the tangent space $\mathbb{R}^n$. For flat space $M = \mathbb{R}^n$ the
bundle space would just be $\mathcal{E} = \mathbb{R}^n \times \mathbb{R}^n = \mathbb{R}^{2n}$. And
as we saw for the circle, the tangent bundle is just a two-dimensional
(infinite) cylinder: $\mathcal{E} = S^1 \times \mathbb{R}$; it is the global
direct product and therefore a trivial bundle.


The $S^2$ tangent bundle. The construction of the tangent
bundle of $S^2$ is more complicated because it is topologically
non-trivial. The two-sphere and various local tangent
planes are sketched in Figure 1.2.35, where we have also


Figure 1.2.35: Tangent bundle of $S^2$. The base manifold is the
two-sphere. The fibers are just copies of $\mathbb{R}^2$. We have indicated
the orthonormal frames which are rotated with respect to a reference
frame on the left. The structure group of the orthonormal
frame bundle is just the group of rotations in the two-dimensional
plane, denoted as $SO(2)$. We have indicated the transition map
from the tangent frames referring to the two patches in the point
$x = A$.


shown how each fiber is related to the standard plane by
some map $\Phi(\theta, \varphi)$. This map basically tells you how the
basis for the fiber as vector space in each point on the
sphere is rotated with respect to some reference frame.
So $\Phi$ corresponds to the angle by which the frame is rotated.
It means that in general in this construction of the
tangent bundle (or simpler: the related orthonormal frame bundle)
there is always a rotation group involved. This map
is smooth on each patch, and one obtains the transition
function to go from the frame for $S_+$ to one for $S_-$ at a point
$x$ in the overlap, by applying the product map
$\Phi_{+-}(x) = \Phi_-(x)\,\Phi_+(x)^{-1}$. Now why is this bundle non-trivial? This
is the question we turn to next.


To find out whether the bundle is trivial we focus on the 
transition map or gluing function in the overlap region. The 
result is depicted in Figure 1.2.36. We start by choosing the 


Figure 1.2.36: Transition map of coordinates and frames. The
two-sphere covered by two coordinate patches $S_\pm$. We have
parallel transported a vector from the North Pole along the light
blue meridians in $S_+$ and from the South Pole in $S_-$ to points
on the equator. Going around the equator we see that the white
vectors rotate clockwise and the pink vectors anti-clockwise by
an angle $\beta_\pm = \pm\varphi$. This yields the transition function
$\Lambda = \beta_+ - \beta_- = 2\varphi$. The topology of the tangent bundle is therefore
non-trivial and has winding number $m = 2$.


white vector (but think of it as a frame) on the North Pole
and transport it along the meridians down to the equator;
there the transported white vectors are found to make an
angle $\beta_+(\varphi) = \varphi$ with respect to the vector at $\varphi = 0$ (parallel
transported along the equator to the tangent space
at the same point). Subsequently we carry the vector at
$\varphi = 0$ on its meridian all the way south, resulting in the
pink colored vector at the South Pole. And from there
we transport that pink vector upward along all the meridians
in $S_-$ again to the equator, yielding the pink vectors
making an angle $\beta_-(\varphi) = -\varphi$ with the pink vector transported
from $\varphi = 0$. What we have constructed is a section
of the frame bundle that is smooth on each patch. The frames in
the overlap region (the equator) on the two patches differ,
and are related by a local rotation in the respective tangent
planes. We see that the transition function corresponds to
a transition angle $\Lambda(\varphi)$, which satisfies $\Lambda(\varphi) = 2\varphi$. If we
walk full-circle around the equator once then the angle
$\Lambda(2\pi) = 4\pi$ has gone around twice. This means that the
bundle is topologically non-trivial; it is similar to
the non-trivial twist of the Möbius band, but here we have
a relative winding number $m = 2$.
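
The winding number of the transition angle can also be counted numerically. The sketch below uses the fact that along a meridian, which is a geodesic, the components of a parallel transported vector with respect to the local unit vectors $\hat{\mathbf{e}}_\theta$ and $\hat{\mathbf{e}}_\varphi$ stay constant. The sign conventions are chosen for convenience and need not match those of the figure; the size of the winding number does not depend on them. All function names are ours.

```python
import math


def frame_angle_from_north(phi):
    """Angle in the local (e_theta, e_phi) basis on the equator of the vector
    e_x transported from the North Pole down the meridian at azimuth phi."""
    # At the North Pole: e_theta = (cos phi, sin phi, 0),
    #                    e_phi   = (-sin phi, cos phi, 0).
    a = math.cos(phi)      # component of e_x along e_theta
    b = -math.sin(phi)     # component of e_x along e_phi
    # Both components are unchanged by transport along the meridian.
    return math.atan2(b, a)


def frame_angle_from_south(phi):
    """Same for the vector -e_x transported from the South Pole up the
    meridian at azimuth phi."""
    # At the South Pole: e_theta = (-cos phi, -sin phi, 0),
    #                    e_phi   = (-sin phi, cos phi, 0).
    a = math.cos(phi)      # component of -e_x along e_theta
    b = math.sin(phi)      # component of -e_x along e_phi
    return math.atan2(b, a)


def winding_number(angle_fn, samples=3600):
    """Net number of turns of angle_fn(phi) as phi runs once around."""
    total = 0.0
    previous = angle_fn(0.0)
    for k in range(1, samples + 1):
        current = angle_fn(2.0 * math.pi * k / samples)
        step = current - previous
        # Fold the step back into [-pi, pi) to undo the jumps of atan2.
        step = (step + math.pi) % (2.0 * math.pi) - math.pi
        total += step
        previous = current
    return round(total / (2.0 * math.pi))


def transition_angle(phi):
    return frame_angle_from_south(phi) - frame_angle_from_north(phi)


n_frame = winding_number(transition_angle)
```

Going once around the equator the relative angle between the two frames makes two full turns. Feeding the function a transition angle equal to $\varphi$ itself gives a single turn instead.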


Let us lift this discussion to the $n$-dimensional spheres $S^n$.
The bundle is an example of a frame bundle; this bundle
is linked to the group of rotations (denoted by $SO(n)$) that
maps all possible orthonormal frame choices for the tangent
spaces $\mathbb{R}^n$ into each other. We cover the sphere by
two overlapping disc-like patches; then the overlap is a
sphere $S^{n-1}$ times the interval. Then the transition function
is a map from the overlap region into the group $SO(n)$.
If this map is contractible, meaning that its homotopy class
is trivial, then the bundle is topologically trivial.


For the two-sphere we had a transition map from the equator
with coordinate $\varphi$ to the frame group $SO(2)$, which is
also a circle, parametrized by the angle $\Lambda$. The classes
of such maps are labeled by the elements of the first homotopy
group $\pi_1(SO(2)) = \pi_1(S^1) = \mathbb{Z}$; the integer $n \in \mathbb{Z}$
is often called the winding number. This means that
$\Lambda(2\pi) = 2\pi n$ and for the frame bundle of the two-sphere
we found $n = 2$. This winding number is a topological invariant
that characterizes the bundle in question. We can
now also answer the corresponding question for the three-sphere;
this boils down to a mapping of the two-sphere
(the ‘equator’) into the group $SO(3)$ of three-frames. The
homotopy group in question is $\pi_2(SO(3)) = 0$. So the group
has only one element, which means that all the maps are
contractible, from which we conclude that this frame bundle
is trivial. And this in turn means that the three-sphere
is ‘parallelizable’.


There is one other observation we want to make, which
links this frame bundle of the two-sphere to the bundle that
we studied in connection with the Dirac magnetic monopole.

Let us recall that for the monopole we basically dealt with
two vector potentials $A_\pm$ defined on two patches on the
two-sphere, linked by a gauge transformation.⁵ In other
words we considered a map from the equator ($S^1$) into the
gauge group of electrodynamics (which is the phase
group $U(1)$, which also corresponds to a circle: $U(1) \simeq SO(2) \simeq S^1$).
We gave the explicit formula for that map
$\Lambda(\varphi) = \varphi$ in equation (1.1.57), meaning that the winding
number for the monopole bundle equals $n = 1$.


The bundle space $\mathcal{E}$ in the monopole case corresponds to
the manifold $S^3$, interpreted as an $S^1$ or phase bundle over
$S^2$. The bundle is exactly the one described by Hopf in
1931. As we will point out in Chapter II.1, the quantum
state space of a single qubit also corresponds to such a
three-sphere.


What we have learned is that the monopole and frame bundle
are both realizations of a circle bundle over the two-sphere,
but they are topologically distinct because they
have winding numbers equal to one and two respectively. The
bundles with higher winding numbers correspond for example
to multiply charged Dirac monopoles satisfying $eg = 2\pi n$.
But the most gratifying is perhaps that, in spite of their
quite different physical origins, these two situations could
be related within the framework of fiber bundles.
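
The winding number is easy to check numerically. The following is a minimal sketch, not part of the original text: it samples a transition map $\Lambda(\varphi)$ around the equator and adds up the small phase steps of $e^{i\Lambda}$. The helper name `winding_number` and the specific choice $\Lambda(\varphi) = 2\varphi$ for the frame bundle are assumptions made for illustration; the text only states that this bundle has $n = 2$.

```python
import cmath
import math

def winding_number(transition, samples=2000):
    """Count how often exp(i*transition(phi)) circles the origin
    while phi runs once around the equator (0 to 2*pi)."""
    total = 0.0
    for k in range(samples):
        phi0 = 2.0 * math.pi * k / samples
        phi1 = 2.0 * math.pi * (k + 1) / samples
        u0 = cmath.exp(1j * transition(phi0))
        u1 = cmath.exp(1j * transition(phi1))
        # Phase step between neighbouring samples, always in (-pi, pi].
        total += cmath.phase(u1 / u0)
    return round(total / (2.0 * math.pi))

# Monopole bundle: Lambda(phi) = phi, as in the text.
monopole = winding_number(lambda phi: phi)
# Frame bundle of the two-sphere: Lambda(phi) = 2*phi is an assumed representative.
frame_bundle = winding_number(lambda phi: 2.0 * phi)
# A smoothly 'wiggled' version of the monopole map keeps the same integer.
wiggled = winding_number(lambda phi: phi + 0.8 * math.sin(3.0 * phi))

print(monopole, frame_bundle, wiggled)
```

The third case illustrates why the integer is called a topological invariant: a smooth periodic deformation of the map changes the picture but not the count.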


Differential geometry.
In this section we have demonstrated that for most physics
applications which involve geometry we need extra structure
on the manifold which allows us to properly define
functions and their derivatives or integrals, as well as
vectors and vector fields. The structural ingredients we need
are a metric, a connection or covariant derivative, and a
definition of the curvature tensor. This takes us to the basic
definitions of Riemannian or, more generally, of differential
geometry.


⁵ We talk about the concentric spherical shells for a fixed radius
larger than zero.

Figure I.2.37: The geometry of the sphere. The yellow point
has spherical coordinates $(r, \theta, \varphi)$. The length of the equator
(red circle in the xy-plane) on the sphere equals $2\pi r$. The red and
blue arcs therefore have equal lengths $s_r = s_b = r\theta$, with the angle
expressed in radians ($2\pi$ radians = 360°). The green-colored
segment is part of a horizontal disc with radius $a = r\sin\theta$.
For the length of the green arc it follows that $s_g = \varphi\, r \sin\theta$.


Metric. We introduced the metric with the definition of the
line element in equation (I.2.16), which in general reads:


$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu \tag{I.2.17} $$


The metric is a symmetric ‘tensor’ with two indices, which
you can think of as a matrix $g_{\mu\nu}$ depending on $x$.


The metric also gives a local definition of the length $|v|$ of a
vector $v^\mu$ in the tangent space at a point $x$ as follows:


$$ |v|^2 = v \cdot v = g_{\mu\nu}\, v^\mu v^\nu = v_\mu v^\mu , \tag{I.2.18} $$


where we have adopted the standard, convenient ‘Einstein
convention’, which says that an index that is repeated in an
expression, once as an upper and once as a lower index, is
automatically summed over, so $v_\mu v^\mu = \sum_\mu v_\mu v^\mu$.


For example, on a two-sphere with radius $r$ and coordinates
$(\theta, \varphi)$ the standard metric has two non-vanishing
components: $g_{\theta\theta} = r^2$ and $g_{\varphi\varphi} = r^2 \sin^2\theta$. The line
element $ds$ follows from:


$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu = r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2) \tag{I.2.19} $$

Looking at Figure I.2.37 it is not hard to see why. You
may verify that (i) for $\varphi$ fixed, the arc or path length on
the sphere corresponding to an angular displacement $d\theta$
equals $r\, d\theta$ (as $\theta$ runs along a great circle), and (ii) for
fixed $\theta$, the path length corresponding to the angular
displacement $d\varphi$ equals $r \sin\theta\, d\varphi$ (as the variable $\varphi$ runs
along a ‘lateral’ circle with radius $r \sin\theta$). What this means
is the following: if we change the coordinate $\varphi$ of an
arbitrary point on the sphere by an infinitesimal amount $d\varphi$,
then the length of the corresponding displacement vector
is $ds = r \sin\theta\, d\varphi$. So the metric tensor locally links
infinitesimal changes in coordinates to infinitesimal path
lengths in the space.


Path length. The length $L_{ab}$ of the curve $\gamma(t)$ parametrized
by $t$, between two points $\gamma(a)$ and $\gamma(b)$, is naturally
defined as the integral:


$$ L_{ab} = \int_a^b |v(t)|\, dt , \tag{I.2.20} $$


In a different, more familiar wording: the distance traveled is
just the magnitude of the velocity along the trajectory,
integrated over the appropriate time interval. So for
example if we choose the lateral green circle (with $\theta$
constant) in Figure I.2.37, we would have $\gamma(t) = \{\theta, \varphi(t)\}$:


$$ L = \int_a^b r \sin\theta\, \frac{d\varphi}{dt}\, dt = r\, (\varphi(b) - \varphi(a)) \sin\theta , \tag{I.2.21} $$

as it should.
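
To see the metric at work, here is a small numerical sketch (my own illustration, not taken from the text) that adds up the line elements of the sphere metric along a curve and compares the result with the closed formula for the lateral circle. The function name `path_length` and the sample values of the radius and angles are arbitrary assumptions.

```python
import math

def path_length(theta_of_t, phi_of_t, a, b, r=1.0, steps=20000):
    """Approximate L = integral of |v(t)| dt on a sphere of radius r by
    summing ds over short pieces, with
    ds^2 = r^2 (dtheta^2 + sin^2(theta) dphi^2)."""
    length = 0.0
    dt = (b - a) / steps
    for k in range(steps):
        t0 = a + k * dt
        t1 = a + (k + 1) * dt
        t_mid = 0.5 * (t0 + t1)
        dtheta = theta_of_t(t1) - theta_of_t(t0)
        dphi = phi_of_t(t1) - phi_of_t(t0)
        sin_theta = math.sin(theta_of_t(t_mid))
        length += r * math.sqrt(dtheta ** 2 + (sin_theta * dphi) ** 2)
    return length

# Lateral circle: theta fixed at 60 degrees, phi running from 0 to 1.5 rad.
radius = 2.0
theta_fixed = math.pi / 3.0
lateral = path_length(lambda t: theta_fixed, lambda t: t, 0.0, 1.5, r=radius)
expected_lateral = radius * 1.5 * math.sin(theta_fixed)

# Meridian: phi fixed, theta running from 0.2 to 1.2 rad, so L = r * 1.0.
meridian = path_length(lambda t: t, lambda t: 0.4, 0.2, 1.2, r=radius)

print(lateral, expected_lateral, meridian)
```

The same routine works for any smooth curve on the sphere; only for these two special families does the integral collapse to a simple product.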


Frames. We like to mention that there is a slightly different
formulation of Riemannian geometry due to Cartan. This
formulation is close to the standard form in which gauge
theories of the non-gravitational interactions are presented.
We start by introducing an orthonormal basis or frame in
the tangent space by writing:


$$ g_{\mu\nu} = \eta_{ab}\, e^a_\mu e^b_\nu , \tag{I.2.22} $$


where $\eta_{ab}$ is the usual flat space metric (or inner product),
and the funny objects $e^a_\mu$ are the so-called solder forms or
‘vielbeine’ that convert vectors from the curvilinear coordinate
components to the ‘flat’ components. So for the
spherical surface we could simply choose $e^1_\theta = r$ and
$e^2_\varphi = r \sin\theta$ and all others equal to zero. These define what
is called a local orthonormal frame $\{e^a\} = \{e^a_\mu dx^\mu\}$.


With these definitions the inner product can be rewritten
as:


$$ v \cdot w = g_{\mu\nu}\, v^\mu w^\nu = \eta_{ab}\, e^a_\mu e^b_\nu\, v^\mu w^\nu = \eta_{ab}\, v^a w^b , \tag{I.2.23} $$

with the flat space vector components defined as
$v^a = e^a_\mu v^\mu$.


Connection. Given the metric $g$ or the frame $\{e\}$ we define
the so-called metric connection $\omega$, which written in
components would read $\omega_\mu{}^a{}_b$, meaning that it is like a
space(time) (row) vector and acts like a matrix in ‘$a$-$b$’
space. This connection is defined by the linear set of
equations:⁶


$$ de + \omega \wedge e = 0 . \tag{I.2.24} $$


Specifying the metric, the metric connection or the set of
‘vielbeine’ are equivalent characterizations of the manifold.
Knowing the frame $\{e\}$, one can solve equation (I.2.24) for
the connection in terms of the vielbeine and their derivatives.
For the two-sphere the result for the connection is
simply $\omega_\varphi{}^1{}_2 = -\cos\theta$. Note that it has two flat indices
and therefore it acts like a matrix in flat space. We introduce
the connection $\omega_\mu$ because it is similar to the gauge
potential $A_\mu$ in gauge theory. The gauge transformations
in the case of general relativity would correspond to local
orthogonal (or Lorentz) transformations, that is, rotations of
the frame that leave the metric, in other words the angles and
lengths of vectors, invariant:


$$ e'^a = O^a{}_b\, e^b . \tag{I.2.25} $$


⁶ We use the quite compact index-free notation because it makes the
underlying structure more transparent. With indices, equation (I.2.24)
would look quite daunting:
$\partial_\mu e^a_\nu - \partial_\nu e^a_\mu + \omega_\mu{}^a{}_b\, e^b_\nu - \omega_\nu{}^a{}_b\, e^b_\mu = 0$.


Curvature. To complete this lightning review of non-Euclidean
or Riemannian geometry, we have to add a final ingredient,
which is the Riemann curvature tensor or two-form
$R$, which is the strict analogue of the ‘field strength’ $F$ in
gauge theories. It can be calculated from the connection
as follows:


$$ R = d\omega + \omega \wedge \omega . \tag{I.2.26} $$


This Riemann curvature is an object with four indices. We
will refrain from descending any further into this myriad of
indices, except for giving the result for the two-sphere.
There is basically only one component that is
non-zero: $R^1{}_2 = R^1{}_{2\,12}\, e^1 \wedge e^2 = \frac{1}{r^2}\, e^1 \wedge e^2$. From this
Riemann curvature one finds the curvature scalar
$R_G = R^{ab}{}_{ab} = 2/r^2$, which for a surface is twice the Gaussian
curvature $1/r^2$. We say that the curvature of the sphere is
constant. It does not depend on where you are on the sphere,
it only depends on the radius of the sphere. If that radius
becomes large the curvature tends to zero. In other words
the space becomes effectively flat.
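
For readers who want to see where the quoted numbers come from, the following worked sketch runs the defining equations (I.2.24) and (I.2.26) for the two-sphere. It assumes the frame choice made above, written as one-forms $e^1 = r\,d\theta$ and $e^2 = r\sin\theta\,d\varphi$, and the antisymmetry $\omega^2{}_1 = -\omega^1{}_2$ of the metric connection; the intermediate steps are my own filling-in and not part of the original text.

```latex
\begin{aligned}
e^1 &= r\,d\theta, \qquad e^2 = r\sin\theta\,d\varphi, \\
de^1 &= 0, \qquad de^2 = r\cos\theta\,d\theta\wedge d\varphi, \\
\omega^1{}_2 &= -\cos\theta\,d\varphi = -\omega^2{}_1, \\
de^1 + \omega^1{}_2\wedge e^2 &= -r\sin\theta\cos\theta\,d\varphi\wedge d\varphi = 0, \\
de^2 + \omega^2{}_1\wedge e^1 &= r\cos\theta\,d\theta\wedge d\varphi
    + r\cos\theta\,d\varphi\wedge d\theta = 0, \\
R^1{}_2 &= d\omega^1{}_2 + \omega^1{}_a\wedge\omega^a{}_2
    = \sin\theta\,d\theta\wedge d\varphi
    = \frac{1}{r^2}\,e^1\wedge e^2 .
\end{aligned}
```

The quadratic term $\omega\wedge\omega$ drops out here because the only non-zero connection components are $\omega^1{}_2$ and $\omega^2{}_1$; in higher dimensions it generally contributes.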


The main point of this subsection is to show that the
analytical structure of differential geometry is highly
canonical. It involves three subsequent defining equations:
(i) for the metric (I.2.22) or the frame, (ii) for the
connection in terms of the frame (I.2.24) and (iii) for the
curvature in terms of the connection (I.2.26). Our aim is not
to make any real
computations but merely to get across that at this level of 
analysis it is evident that general relativity and gauge the- 
ories share an underlying geometric structure. Roughly 
stated, both involve a connection and a curvature defined 
in terms involving derivatives of the connection. 


The geodesic equation. Geodesics are the paths along
which free particles move. We have asked what the shortest
path between two points is on a sphere and found it
to be a segment of a great circle. In general that question
can be answered by minimizing the path length $L_{ab}$ under
variations of the path. From the metric one can directly
construct a free particle Lagrangian, which is a function of
the coordinates and their time derivatives:


$$ \mathcal{L}\Big(x^\mu, \frac{dx^\mu}{dt}\Big)
   = \frac{1}{2} m\, g_{\mu\nu} \frac{dx^\mu}{dt} \frac{dx^\nu}{dt}
   = \frac{1}{2} m r^2 \Big[ \Big(\frac{d\theta}{dt}\Big)^2 + \sin^2\theta\, \Big(\frac{d\varphi}{dt}\Big)^2 \Big]
   = \frac{1}{2} m\, (v_\theta^2 + v_\varphi^2) \tag{I.2.27} $$


Minimizing the time integral of $\mathcal{L} \sim |v|^2$ (instead of $L_{ab}$) one
obtains the so-called Euler-Lagrange equations. On the
two-sphere one obtains:


$$ \frac{d}{dt}\Big(\frac{d\theta}{dt}\Big) - \cos\theta \sin\theta\, \Big(\frac{d\varphi}{dt}\Big)^2 = 0 , \qquad
   \frac{d}{dt}\Big(\sin^2\theta\, \frac{d\varphi}{dt}\Big) = 0 . \tag{I.2.28} $$


These are the Newtonian equations of motion for a particle
on a sphere in the absence of an external force, as
discussed in the section on Newtonian mechanics in Chapter
I.1. All terms have two time derivatives; among them are
the pure ‘accelerations’ in the $\theta$ and $\varphi$ directions. There
is no potential as such, and the extra terms that appear
are a consequence of spherical geometry. So like in flat
space, where a force would typically curve the orbit, and
straight lines (describing shortest distances) are obtained
by setting the force equal to zero, something similar is true in
curved spaces, where free particles move along geodesics,
and they do so independently of their mass or momentum.


Let us check a few simple solutions. For example, if we
assume that the velocity component in the $\varphi$ direction,
$\sin\theta\, d\varphi/dt$, vanishes, we obtain the solutions $d\theta/dt =$
constant. These describe a particle moving with constant
velocity along any meridian (where $\varphi$ = constant).
This shows that the meridians are indeed shortest paths.
Choosing $d\theta/dt = 0$, on the other hand, gives the solution
$\theta = \pi/2$, $d\varphi/dt$ = constant, which corresponds to
the particle moving with constant velocity along the equator,
again a ‘big’ circle or geodesic.
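
These special solutions, and the general case, can be checked by integrating the two equations of (I.2.28) numerically. The sketch below is an illustration under stated assumptions: it rewrites the second equation as $\ddot\varphi = -2\cot\theta\,\dot\theta\dot\varphi$, uses a standard fourth-order Runge-Kutta step, and sets $r = 1$. The function names, step size and initial data are my own choices; paths through the poles are avoided because the coordinates are singular there.

```python
import math

def geodesic_rhs(state):
    """First-order form of the geodesic equations on the unit sphere.
    state = (theta, phi, dtheta/dt, dphi/dt)."""
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * dt))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * dt))
    k4 = geodesic_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end, dt=1e-3):
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
    return state

def speed_squared(state):
    """|v|^2 / r^2 = (dtheta/dt)^2 + sin^2(theta) (dphi/dt)^2."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + (math.sin(theta) * dphi) ** 2

# Start on the equator moving east: the particle should stay at theta = pi/2.
equator = integrate((math.pi / 2.0, 0.0, 0.0, 1.0), 2.0)
# Start on a meridian moving south: phi should not change.
meridian = integrate((0.5, 0.3, 0.7, 0.0), 1.0)
# A generic great circle: the speed is conserved, as for any free particle.
generic_start = (1.0, 0.0, 0.3, 0.8)
generic_end = integrate(generic_start, 3.0)

print(equator[0], meridian[1], speed_squared(generic_start), speed_squared(generic_end))
```

Conservation of the speed is the numerical counterpart of the statement that there is no potential: the extra terms only bend the coordinate description, not the motion itself.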


These calculations confirm our previous observations with
respect to Figures I.2.31 and I.2.32, where we saw that
the shortest path between two points is always a segment
of a great circle. That allowed us to also draw a triangle on
the sphere as we did in Figure I.2.32, and what we see is
that the triangle has three 90° angles. In other words,
the sum of the three angles of this triangle is 270°, which is
far more than the 180° of a planar triangle. It is amusing to
note that if you move the two lower points of the triangle to
the South Pole, you get a non-trivial ‘two-angle’.


Let us finally note also that all shortest paths from the
North to the South Pole, in other words all meridians, are
in fact ‘parallel’, because they all are perpendicular to the
equator. Indeed, in a curved space ‘parallel lines’ may
cross. Boy! Yet another reason why life on Earth is so
complicated. My motto would be: be prepared, think global
and act local, rather than the other way around.


The geometry of gauge invariance


A gauge theory is the prototype model for all fundamental
interactions, where the gauge field may either describe the
electromagnetic field corresponding to the photon, or the
fields mediating the strong interactions corresponding to 8
gluons, or the weak interactions described by the $W^\pm$ and
$Z$ bosons, and finally it may describe general relativity
corresponding to the gravitational interaction mediated by the
graviton. The gauge symmetry principle is therefore a
fundamental and universal hallmark of nature. Gauge invariance
imposes a strong constraint on the system of fields
involved. In particular it completely fixes how the force
carrying fields just mentioned interact with the ‘charge’
carrying fields or constituent particles like the electron, the
muon, the neutrinos or the quarks. On the other hand this
physically based gauge principle is deeply linked to the
geometry of fiber bundles, of which the tangent bundle we
discussed is only one example.


The topic of gauge invariance will pop up in this book at 
regular intervals. Here we give some of the mathemati- 
cal background of the gauge principle which corresponds 
to the geometry of fiber bundles. In Chapter I.4 on the
quest for the basic building blocks of matter we discuss 
how the gauge theory approach has led to the standard 
model of the fundamental interactions between elemen- 
tary particles. And in Chapter II.6 on symmetries and their 
breaking we describe in more detail what the equations for 
gauge theories look like and how they implement the idea 
of a local gauge invariant dynamics. 


The charge degree of freedom. To get a better feeling
for the notion of charge and its connection to gauge
invariance it may help to explicitly introduce a model for
electric charge. Think of particles carrying an extra periodic
coordinate $\beta$ that labels points in some ‘internal space’ that in
this case corresponds to a tiny circle. You may think of $\beta$
in that sense as an extra charge degree of freedom, and
the charge $q = ne$ as a kind of momentum in this internal
electromagnetic dimension with coordinate $\beta$. The particle
carries along a charge-phase factor


$$ f_n(\beta, x) = e^{in\beta} , \tag{I.2.29} $$


If we split this phase factor in its real and imaginary parts
by writing it as $\cos n\beta + i \sin n\beta$, we represent the phase
factor as a little two-dimensional unit vector making an angle
$n\beta$ with the real (horizontal) axis.⁷ Note that if we vary
$\beta$ from 0 to $2\pi$, then the phase of the particle with charge
number $n$ changes by $2\pi n$, so the corresponding little vector
rotates $n$ times as fast.


⁷ If you are unfamiliar with the notion of a complex phase factor you
might want to look at the Math Excursion on complex numbers at the
end of Part III on page 630.


You may say that the charge-number corresponds to the
‘momentum’ in the $\beta$ direction because it is proportional to
the $\beta$ derivative, $-i e\, \partial_\beta f_n = q f_n$, and as there is no
$\beta$-dependent potential or force, the $\beta$-momentum (= charge)
is conserved. The dynamics in $\beta$-space is therefore entirely
trivial, and that is precisely why nobody talks about it
in the first place. But it at least explains the terminology
that charge corresponds to an internal degree of freedom.
I think that it is also quite helpful for getting a better
understanding of the notion of gauge invariance. And moreover,
if we would treat this little charge-degree of freedom as
a quantum particle on a circle, the momentum (= charge)
would be quantized as well. It would look like the Bohr
model applied to the quantization of charge.
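
A few lines of code make both statements concrete: that $-ie\,\partial_\beta$ reads off the charge, and that only integer $n$ gives a phase factor that is single-valued on the circle. This is a minimal sketch with assumed names (`charge_phase`, `charge_operator`) and with the elementary charge set to one in arbitrary units; the derivative is taken by a simple central difference.

```python
import cmath
import math

E_CHARGE = 1.0  # elementary charge in arbitrary units (assumption)

def charge_phase(n, beta):
    """Charge-phase factor f_n(beta) = exp(i n beta)."""
    return cmath.exp(1j * n * beta)

def charge_operator(n, beta, h=1e-5):
    """Apply -i e d/dbeta to f_n, using a central finite difference."""
    derivative = (charge_phase(n, beta + h) - charge_phase(n, beta - h)) / (2.0 * h)
    return -1j * E_CHARGE * derivative

n, beta = 3, 0.7
# The operator returns q * f_n, so dividing by f_n isolates the charge q = n e.
measured_q = charge_operator(n, beta) / charge_phase(n, beta)

# Going once around the internal circle must give back the same phase factor.
mismatch_integer = abs(charge_phase(3, beta + 2.0 * math.pi) - charge_phase(3, beta))
mismatch_fractional = abs(charge_phase(2.5, beta + 2.0 * math.pi) - charge_phase(2.5, beta))

print(measured_q, mismatch_integer, mismatch_fractional)
```

The fractional case fails to close on itself after one turn, which is the circle analogue of the Bohr quantization condition mentioned above.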


Gauge transformations. The best way to think about a (local)
gauge transformation is as a position- and time-dependent
rotation, not in real space but in some internal
vector space that is carried by each of the matter fields.
To clarify this let me return to electromagnetism. Another
way to look at the local charge-phase factor we introduced
is that it is the phase of a field $\psi(x) = f_n(x)\rho(x)$ having
a charge $q = ne$. In quantum theory a particle with
charge $q = ne$ is described by a complex wavefunction
$\psi(x) = f_n(x)\rho(x)$, where $f_n(x)$ represents the local phase
of that wavefunction and the function $\rho(x)$ its magnitude.
Formally a gauge transformation acts on the fields (in fact
on the phase factor) as follows:

$$ f_n \to f'_n = U_n f_n \quad \text{with:} \quad U_n(\Lambda) = e^{in\Lambda} , \tag{I.2.30} $$

The transformation $U_n$ corresponds to a unitary representation
of the group U(1) labeled by the integer $n \in \mathbb{Z}$. It is
unitary because $U^\dagger U = U^{-1} U = 1$. And the gauge group
of the theory is therefore naturally called U(1), because a
phase factor can be thought of as a (1 × 1) unitary matrix.
If you prefer to talk about the little vector, then you should
refer to the gauge group SO(2), the group of rotations in
the two-dimensional plane, but that group is the same as
(or isomorphic to) U(1) because its elements are also
labeled by an angle.


(a) Charge-phase factor $f_1$ in the trivial gauge $\beta(x, y) = 0$.
(b) Charge-phase factor $f_1$ in the gauge given by $\Lambda = xy$.
(c) Charge-phase factor $f_1$ in the gauge given by $\Lambda = 2r \cos(x + y)$.
(d) The gauge potential $A = (1, x)$ yields a constant magnetic field in the z-direction.
(e) The vector potential after the transformation $U = e^{i\Lambda}$, with $A' = A - \nabla\Lambda$ and the $\Lambda$ given above.
(f) The vector potential in a gauge with $\Lambda$ given above.


Figure I.2.38: Gauge transformations. The effect of two different local gauge transformations on the phase factor $e^{i\beta(x,y)}$ and on the
gauge potential $A = (A_x, A_y)$. All describe the same uniform magnetic field that is directed out of the page to the front.


So, properly speaking, the phase factor $f_n(x)$ of the
wavefunction is an element of a one-dimensional complex, or
two-dimensional real, vector space Rep, on which the unitary
representation of the gauge group U(1) with label $n$
acts as a transformation. In brief, if I make a gauge
transformation $\Lambda(x)$, the phase factor of the field will
transform by multiplication with a phase factor $\exp(in\Lambda(x))$
and therefore its phase gets shifted by $n\Lambda(x)$. And the
gauge potential transforms as indicated in the formula
(I.1.48).


What gauge invariance means is that we are free to choose
a frame or basis for the two-dimensional vector space in
which the unit charge vector $f_n(\beta, x)$ lives, at every point $x$
independently, very much like the tangent spaces we
discussed before. In other words at any point $x$ we have the
choice of which point on the circle we call the origin, to
which we assign the value $\beta = 0$. Gauge invariance is
the statement that the physics does not depend on that local
choice. This implies that the physics doesn’t change if
we shift $\beta$ at each point $x$ by an amount $\Lambda(x)$. We have
illustrated this in Figure I.2.38, where we have depicted
both the phase $\beta(x)$ and $A(x)$ in three different gauges
for a situation in two spatial dimensions. These images
underscore the great generality of local gauge transformations,
and in spite of looking so different, the magnetic field
$B = \nabla \times A$ is the same for all three configurations. It is a
gauge invariant quantity and corresponds in this case to a
uniform field in the positive z-direction.
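
The claim that very different looking potentials carry the same magnetic field can be tested with finite differences. The sketch below is illustrative only: it takes $A = (1, x)$ and the gauge function $\Lambda = xy$ from the figure, while the amplitude 2 in the cosine gauge function and all function names are assumptions of mine.

```python
import math

def potential(x, y):
    """Gauge potential A = (A_x, A_y) = (1, x), as in panel (d)."""
    return (1.0, x)

def gauge_xy(x, y):
    return x * y

def gauge_cos(x, y):
    return 2.0 * math.cos(x + y)  # amplitude chosen for illustration

def gradient(func, x, y, h=1e-4):
    """Central-difference gradient of a scalar function."""
    return ((func(x + h, y) - func(x - h, y)) / (2.0 * h),
            (func(x, y + h) - func(x, y - h)) / (2.0 * h))

def transformed(gauge):
    """Return the function A' = A - grad(Lambda)."""
    def new_potential(x, y):
        ax, ay = potential(x, y)
        gx, gy = gradient(gauge, x, y)
        return (ax - gx, ay - gy)
    return new_potential

def b_field(pot, x, y, h=1e-3):
    """z-component of curl A: dA_y/dx - dA_x/dy."""
    day_dx = (pot(x + h, y)[1] - pot(x - h, y)[1]) / (2.0 * h)
    dax_dy = (pot(x, y + h)[0] - pot(x, y - h)[0]) / (2.0 * h)
    return day_dx - dax_dy

point = (0.3, -1.2)
gauges = (potential, transformed(gauge_xy), transformed(gauge_cos))
potentials_at_point = [pot(*point) for pot in gauges]
fields_at_point = [b_field(pot, *point) for pot in gauges]

print(potentials_at_point)
print(fields_at_point)
```

The potentials differ from gauge to gauge, while the curl stays at the same uniform value, because the curl of a gradient vanishes identically.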


The reasoning above clarifies the use of the term ‘gaug- 
ing.’ If you think of a little pointer moving over a dial, then 
applying a gauge transformation amounts to redefining the 
label ‘zero’ on the dials located at different positions x. It 
is like calibrating the dials in all of space-time. 


Gauge covariant derivative. Let us now consider the
following question. I have a space with some electromagnetic
fields (or potentials) in it, and I take a charge that sits
at $x = x_0$ and move it slowly along some path $\gamma$ to
another point $x_1$. What will happen? I was careful enough to
not disturb the fields, but as the charge interacts with the
fields along the way, did something happen to that phase
perhaps? Well, certainly, and what will happen is that the
phase $\beta$ will change on its way to $x_1$. By what amount
does it change? And does that change depend on the
path I choose? These are the questions that we turn to
next.


Let us take small steps at a time, or better even, infinitesimal
steps! So, suppose we want to know what the charge-phase
would look like at a nearby point, then we can make
a linear approximation only keeping the first derivative:


$$ f_n(x') = f_n(x + \Delta x) \simeq f_n(x) + \Delta x^\mu \left.\frac{\partial f_n}{\partial x^\mu}\right|_x + \cdots , \tag{I.2.31} $$


but this does not take care of the change in the frame in
which the phase is expressed, by which I basically mean
the orientation of the real and imaginary axes of $f_n$ at
different points $x$. That basis change is determined by the
gauge connection $A_\mu(x)$, which means that we have to
replace the ordinary derivative with the so-called gauge
covariant derivative:


$$ \partial_\mu \to D_\mu \equiv \partial_\mu + i q A_\mu . \tag{I.2.32} $$


The added gauge connection literally connects the frames 
in neighbouring points. It is not sufficient to just calcu- 
late the phase; in order to compare the phases at different 
points you need to know how the bases at those points are 
related. 


Why the term covariant derivative? It is like the derivative
in a co-moving frame, and therefore the appropriate
term indeed. This becomes clear if we look at how this
derivative transforms under gauge transformations, given
that the field transforms as given in (I.2.30) and the potential
like that given in (I.1.48) as $A_\mu \to A'_\mu = A_\mu - \partial_\mu \Lambda$.
We find:


$$ D_\mu f_n \to D'_\mu f'_n = (\partial_\mu + i q A'_\mu) f'_n = U_n [D_\mu f_n] , \tag{I.2.33} $$


which shows that this derivative transforms covariantly
indeed, in other words, just like the $f_n$ itself. This is an
important observation because one sees that quantities like
$|D_t f_n|^2$ and $|\mathbf{D} f_n|^2$ will be gauge invariant, and these are
terms that appear in the expression for the energy density
of the field $f_n$. And this in turn implies that to get an
invariant energy the interaction between the charged field (or
particle) and the gauge field has to be of a form involving
the gauge-covariant derivative. That is what it means to say
that gauge symmetry dictates the detailed nature of the
interactions!
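
The covariance property (I.2.33) can be verified numerically in one dimension. The sketch below assumes units in which $e = 1$, so that $q = n$ and the same number appears in $U_n = e^{in\Lambda}$ and in $D = \partial + iqA$; the particular field, potential and gauge function are arbitrary smooth choices of mine, and derivatives are central differences.

```python
import cmath
import math

Q = 2  # charge number n; units with e = 1, so q = n (assumption)

def field(x):
    """Some smooth complex charged field (illustrative choice)."""
    return (1.0 + 0.3 * math.sin(x)) * cmath.exp(1j * Q * 0.5 * x * x)

def potential(x):
    return math.cos(2.0 * x)            # illustrative A(x)

def gauge(x):
    return 0.7 * math.sin(x) + 0.2 * x   # illustrative Lambda(x)

def d_dx(func, x, h=1e-5):
    return (func(x + h) - func(x - h)) / (2.0 * h)

def covariant_derivative(f, a, x):
    """D f = (d/dx + i q A) f."""
    return d_dx(f, x) + 1j * Q * a(x) * f(x)

def u(x):
    """Local phase rotation U_n(Lambda(x)) = exp(i n Lambda(x))."""
    return cmath.exp(1j * Q * gauge(x))

def field_prime(x):
    return u(x) * field(x)

def potential_prime(x):
    return potential(x) - d_dx(gauge, x)

x0 = 0.4
# Covariant derivative: transforms with the same factor U as the field.
lhs = covariant_derivative(field_prime, potential_prime, x0)
rhs = u(x0) * covariant_derivative(field, potential, x0)
# Ordinary derivative: picks up an extra term i q (dLambda/dx) f.
plain_lhs = d_dx(field_prime, x0)
plain_rhs = u(x0) * d_dx(field, x0)

print(abs(lhs - rhs), abs(plain_lhs - plain_rhs))
```

The first difference is zero up to discretization error, the second is of order one; and since $|U| = 1$, the magnitude $|Df|$ is the same in both gauges.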


Suppose we have a function h(x), imposing that dh/dx = 
0 implies that h = constant. Something similar can be 
defined for the covariant derivative. The equation for what 
is called a covariantly constant charge vector reads 


$$ D_\mu f_n = 0 . \tag{I.2.34} $$


The solution to this equation amounts to expressing a path 
dependent relation between the phase at two points, cor- 
responding to the parallel transport of the charge-phase 
along a given curve. Let us look at this statement in more 
detail. 


The gauge connection. To carry the phase factor around
we need a somewhat fancy expression involving the gauge
connection $A$. Let me recall the line integral $I(\gamma; x_0, x_1)$
of the gauge field along a curve $\gamma$ given in (I.1.51) and
depicted in Figure I.1.26. To parallel transport the phase
along some curve $\gamma$ going from $x_0$ to $x_1$, we need to use
not just the line integral $I(\gamma; x_0, x_1)$, but rather its
exponential:


$$ W_n(\gamma; x_0, x_1) = e^{-i q\, I(\gamma; x_0, x_1)} . \tag{I.2.35} $$


This path dependent phase factor carries exactly the frame
from $x_0$ to $x_1$ so that:


$$ f_n(\beta, x_1) = W_n(\gamma; x_0, x_1)\, f_n(\beta, x_0) . \tag{I.2.36} $$


This expression furnishes the general solution to the
equation (I.2.34) for the covariantly constant charge-phase $f_n$.
It transports the phase in a gauge covariant way, that is
to say in such a way that under a gauge transformation we
have that


$$ W_n \to W'_n(\gamma; x_0, x_1) = U_n(x_1)\, W_n(\gamma; x_0, x_1)\, U_n^\dagger(x_0) , $$


and this means that in the equation (I.2.36), the combined
effect of a gauge transformation on all factors is the same
on the left- and right-hand side. The net effect is a
multiplication by $U_n(x_1)$ from the left, as it should be according
to (I.2.30). So we have answered both questions: how the
charge-phase will change and that it does so depending
on the path chosen. The gauge ‘connector’ is nothing but
the path dependent phase factor $W_n$.
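
Both properties of the transporter, path dependence and gauge covariance, can be seen in a small numerical experiment. The sketch below is an illustration under assumptions: units with $e = 1$, the potential $A = (1, x)$ of the uniform field used earlier, the sign convention $W = e^{-iq\int A\cdot dl}$ (the one that solves $D f = 0$ for $D = \partial + iqA$), and two hand-picked paths from $(0,0)$ to $(1,1)$. All function names are hypothetical.

```python
import cmath

Q = 1.0  # charge in units with e = 1 (assumption)

def potential(x, y):
    return (1.0, x)        # uniform field B = 1 in the z-direction

def potential_xy(x, y):
    return (1.0 - y, 0.0)  # same field after A' = A - grad(Lambda), Lambda = x*y

def transporter(path, a=potential, steps=2000):
    """W = exp(-i q * line integral of A.dl) along path(s), s in [0, 1],
    using the midpoint rule on short straight pieces."""
    integral = 0.0
    for k in range(steps):
        x0, y0 = path(k / steps)
        x1, y1 = path((k + 1) / steps)
        ax, ay = a(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        integral += ax * (x1 - x0) + ay * (y1 - y0)
    return cmath.exp(-1j * Q * integral)

def right_then_up(s):
    return (2.0 * s, 0.0) if s < 0.5 else (1.0, 2.0 * s - 1.0)

def up_then_right(s):
    return (0.0, 2.0 * s) if s < 0.5 else (2.0 * s - 1.0, 1.0)

w1 = transporter(right_then_up)
w2 = transporter(up_then_right)
# The two paths enclose the unit square; their relative phase is -q * flux.
loop_phase = cmath.phase(w1 / w2)

# Gauge covariance: W' = U(x1) W U^dagger(x0), with Lambda(1,1) = 1, Lambda(0,0) = 0.
w1_prime = transporter(right_then_up, a=potential_xy)
expected_w1_prime = cmath.exp(1j * Q * 1.0) * w1

# The closed-loop phase itself does not change under the gauge transformation.
loop_phase_prime = cmath.phase(w1_prime / transporter(up_then_right, a=potential_xy))

print(loop_phase, loop_phase_prime)
```

The individual transporters change from gauge to gauge, but the phase around the closed loop does not; it only knows about the magnetic flux through the enclosed area.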


Gauge theory and principal fiber bundles. The central
concept describing both the gauge potentials and the
underlying space-time manifold $M$ is called a principal bundle,
denoted by $\mathcal{E}$, consisting of a space which locally can
be thought of as a direct product of the gauge group $G$ and
the base manifold $M$: $G \times M$.


In Figure I.2.39 we have given a simple example in which
the base manifold is a circle $M = S^1$ parametrized by an
angle $\varphi$ (the red circle). For the group we have chosen
the group of rotations in the plane denoted by SO(2),
parametrized by an angle $\Lambda$, making the group space also a
circle $G = S^1$ (the purple circle). This group is the same as
the phase group U(1), and as we saw this group is actually
the gauge group of electrodynamics. Above each point of
$M$ we have a copy of $G$ with an angle $\Lambda = \Lambda(\varphi)$.


Figure I.2.39: A principal bundle associated with a gauge
group $G$. Here the group is the phase group U(1), which is a
circle (in brown) parametrized by $0 \le \Lambda < 2\pi$. The base space
$M$ is also a circle (in red) with $0 \le \varphi < 2\pi$. Above each point
in the base space we have a fiber that is a copy of the group
$G$ labeled by an angle $\Lambda$. Choosing a gauge corresponds to
choosing a (cross) section of the bundle: a particular choice for
$\Lambda = \Lambda(\varphi)$ as indicated on the right.


The point is now that any electromagnetic field configuration
corresponds to a particular bundle, and to write down
the configuration of the potentials explicitly
we have to ‘choose a gauge’, which amounts to choosing
a particular cross-section through the fibers, specifying a
particular choice of $\Lambda = \Lambda(\varphi)$. This is for example
done in the picture on the right-hand side.


Charge carrying fields and associated bundles. Often the
gauge field is called the gauge connection, because it
connects local coordinate frames at different points with each
other. In general a charge carrying field carries a
representation of the gauge group and these correspond to
so-called associated bundles, where we have attached a copy
of the representation space Rep of the group $G$ to every
point of the base manifold in some smooth way.


Returning to our previous example, a field carrying a charge
$q = ne$ would be described by a complex field, say $\psi_n(x)$,
which has a magnitude and a charge-phase that again
may depend on position, so,


$$ \psi_n(x) = \exp(in\beta(x))\, \rho(x) . $$


Gauge transformations on such a field act as a local phase
transformation; we multiply the field with the local phase
factor $U_n(\Lambda(x))$:


$$ \psi_n(x) \to \psi'_n(x) = U_n(\Lambda(x))\, \psi_n(x) . \tag{I.2.37} $$


Examples are depicted in Figures I.2.39 and I.2.40,
which give you an impression of the case where $M \simeq S^1$,
$G \simeq U(1)$. The representations act on the $f_n(\beta(\varphi))$
and the representation space can be depicted by the little
charge vector. In Figure I.2.39 you see that the fiber
corresponds to the orbit of the charge vector under rotations,
and a specific bundle is obtained by choosing a particular
gauge, which means that above every point of $M$
you choose a particular vector, making sure that the overall
configuration is smooth. This is appropriately called
choosing a (cross) section of the bundle. This leads
for example to the configurations depicted in Figure I.2.40
of a number of smooth closed ribbons. The configuration
on the left represents the constant phase $\beta(\varphi) = 0$,
corresponding to the connection $A = 0$ of the trivial bundle. The
other phase configurations are smooth deformations that
correspond to gauge transformations. So all three represent
the same physical situation in different gauges. They
are gauge equivalent configurations. It clearly demonstrates
the local character of the gauge transformations,
because at any point of the base manifold we can choose
a different rotation, as long as the overall deformations
correspond to smoothly ‘wiggling’ the configuration.


Figure I.2.40: Gauge equivalence. Here we have depicted
gauge equivalent configurations of the charge-phase factor
$e^{in\beta(\varphi)}$ on a circle. On the left we have the trivial
configuration $\beta(\varphi) = 0$. Gauge transformations correspond to ‘wiggling’
the configuration. The configurations above are related by a
periodic transformation $\Lambda(\varphi)$ so that $\beta \to \beta' = \beta + \Lambda$, with
$\Lambda(0) = \Lambda(2\pi)$.


Gauge invariant characteristics. You may wonder if it is
possible in reality to ‘drag’ a state vector along a closed
loop like we described and whether the resulting phase
change can be measured. The answer is yes, and what I
have described is known as the Berry phase, after the
British mathematical physicist who discovered that it was
possible to identify it in certain setups with
time- or space-dependent Hamiltonians. The effect is also
closely related to the much older Aharonov-Bohm effect,
as will be discussed in Chapter II.3. This interference
pattern depends on the solid angle that the path H(A) has
covered on the sphere.⁸ Interestingly the Berry phase is
apparently a purely geometric phase depending only on
the geometry of the space of Hamiltonians.


⁸ The path is oriented and the orientation decides whether to take
the solid angle $\omega$ or $4\pi - \omega$, which with equation (II.3.4) amounts to
$R_x(\theta) = R_x(-\theta)^\dagger$.


Other gauge groups. We have in this subsection exhibited
the structure of a principal fiber bundle and the vector
bundles associated with its representations, but only for
the rather modest example of the group U(1). This may
have come across as a demonstration of how to crack
a peanut with a sledgehammer. We want to stress that
the fiber bundle description is quite universal, as it is the
framework in which most classical physics involving local
symmetries can be cast. For example the Standard Model
involves U(1), SU(2) and SU(3) gauge fields describing
the electromagnetic, weak and strong interactions
respectively. And the Grand Unified Theories (GUTs) that we will
discuss in Chapter I.4 have even larger gauge groups like
SU(5) or SO(10) involving more gauge interactions.


It means that the fields take a value in a representation
space which is a vector space V = Rep, typically
$\mathbb{C}^n$ or $\mathbb{R}^n$. The corresponding unitary representation
of the group acts on this space as a linear transformation
(say, a rotation). The group can be any compact group
like the groups of unitary or orthogonal (N × N) matrices
with unit determinant, denoted by SU(N) or SO(N)
respectively. The label $n$ on the field refers to the dimension
of the vector space on which some irreducible (unitary)
matrix representation of that gauge group acts. The Math
Excursion on groups on page 635 of Part III gives a basic
introduction to group theory. As we saw, the group U(1) is
special in that all representations are one-dimensional,
meaning just phase factors.


For the group SO(3) the unitary representations are labeled
by a non-negative integer $l$, where the group is
then represented by $(2l + 1) \times (2l + 1)$ matrices. This
representation acts as a transformation group on a
$(2l + 1)$-dimensional vector space. A field in this SO(3) gauge
theory will take a value in one of these vector spaces and is
said to carry integer spin $l$. When the spin equals one we
have the standard three-dimensional vector, but one that
lives in a Rep = $\mathbb{R}^3$ internal space.


For SU(3), the gauge group related to the strong interac- 
tions, the quarks and antiquarks transform as 3-dimensional 
representations (color triplets and anti-triplets), while the 
gluons form an 8-dimensional representation. This indeed 
means that SU(3), the group of 3 x 3 unitary matrices 
with a unit determinant, also has a representation by 8 x 8 
unitary matrices, which is irreducible, meaning that it can- 
not be reduced to a lower dimensional (for example three- 
dimensional) representation. This representation acts on 
the eight-dimensional vector field, which describes the glu- 
ons. A major achievement in mathematics has been that 
in the early twentieth century all these continuous groups 
and their representations were classified. The results have 
found a rich variety of applications in physics as we will 
show in Chapter I.4 where we discuss the phenomenology
of the ‘Standard Model’.


In the previous subsection we argued that to describe vector
fields on curved spaces one needs to introduce the
so-called ‘tangent bundle’ of the manifold. This means that
general relativity can also be cast as a gauge theory
where the local gauge group is the symmetry group of the 
local structure of space-time. Locally our space-time is flat 
Minkowski space-time with its translation and the Lorentz 
symmetries. The corresponding group is called the inho- 
mogeneous Lorentz or Poincaré group. The field strength 
in that case corresponds to the local curvature tensor R 
of the manifold and the connection would be the so-called 
metric connection w, that we introduced in equations 1.2.26 
and 1.2.24. It is gratifying to see that these phenomenolog- 
ically so different fundamental interactions that we have 
encountered in nature share this underlying structure of 
gauge invariance, mathematically represented by the con- 
cept of a fiber bundle. We must add the important fact 
though, that the physics itself resides in the field equations, 
being the Maxwell (more generally, the Yang-Mills) and 
Einstein equations. We return to the Yang-Mills equations 
in Chapter II.6 on symmetries and their breaking. The 
bundle picture makes the mathematical setting transparent 
and clarifies some of the physical features.


The physics of information:
from bits to qubits 


It would appear that we have reached the limits 
of what is possible to achieve with computer tech- 
nology, although one should be careful with such 
statements, as they tend to sound pretty silly in 5 
years. 

John von Neumann (1951) 


Computation necessarily involves information storage and 
the manipulation of information on some underlying physi- 
cal substrate, so far mostly based on semiconductor tech- 
nology. Information is stored in the states of the system 
and one can manipulate the states by interacting physi- 
cally with that system. If the scaling down of basic com- 
ponents is to continue as is predicted by Moore's law, then 
entering the quantum domain is inevitable. So, there is a 
quantessence to information as well. This has profound 
consequences for how we should think about information 
and information processing. It turns out that quantum com- 
putation offers fundamentally different options for tackling 
certain classes of hard problems. 


A bit of information. Volumes are typically measured in 
liters, gallons, pints or cubic meters; and the unit chosen 
strongly depends on the local context. For information, 
however, this does not hold; it is universally measured in 
bits. This canonical character derives from the fact that 
the introduction of computers was right from the start a 
global affair. The ‘bit’ is the smallest unit of information and 
forms the basis for digital memories and data processing 
devices. One bit can be represented in many ways, for 
example like a switch that is on or off, or a single digit bi- 
nary number being either one or zero, or equivalently as 
a magnetic spin pointing either up or down, or a number 
that is either plus or minus one (see Figure 1.2.41).


Figure 1.2.41: The bit. Various representations of a bit of
information. It is a two-state system such as a switch, a
particle that can be in either of two states, or a classical spin
that can point up or down.


If I want to qualify for a discount on a public transportation
ticket, for example, only one bit of information concerning
my age will do. I only have to answer one yes-or-no question:
are you younger or older than 65? In answering a
single yes-or-no question you provide one bit of informa- 
tion. Generally quantitative thinking is based on working 
with variables that can be assigned numerical values; we 
attach numbers to them even though these may be only 
approximate. Those finite approximations can always be
converted to finite base-2 or binary numbers, containing
only ones and zeros, and any calculations that you would
like to do with the original numbers can also be performed 
in base-2. And we all know that such calculations can be 
extremely well and swiftly performed by today’s digital de- 
vices, at least if an efficient algorithm is available. 
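
As a small illustration of the conversion to base-2 mentioned above, here is a minimal Python sketch. The helper names `to_binary` and `from_binary` are illustrative choices, not notation from this book; the first expands a non-negative integer in powers of 2 by repeated division, the second sums the powers of 2 back up.

```python
def to_binary(number):
    """Return the base-2 digit string of a non-negative integer."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 2)  # peel off the lowest bit
        digits.append(str(remainder))
    return "".join(reversed(digits))           # most significant bit first


def from_binary(bits):
    """Sum the powers of 2 selected by a string of 0s and 1s."""
    return sum(int(b) * 2**power for power, b in enumerate(reversed(bits)))


print(to_binary(21))        # 10101
print(from_binary("1011"))  # 11
```

Any calculation on the original numbers can then be carried out on these digit strings, which is essentially what a digital device does.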


Information and entropy 


Figure 1.2.42: Entropy and information. Counting the number
of states of a digital memory. Shannon defined the information
capacity of a system in bits as the logarithm of the number of
states; information is therefore directly proportional to the
notion of entropy in physics as defined by Boltzmann in the
nineteenth century.


State counting, entropy and information. In all information
devices the information is carried by a physical substrate
representing a certain number of bits. The amount
of information that can be stored in a physical system is de- 
termined by the number of distinct states that the system 
can be in. Let us think of a memory consisting of an array 
of little magnets that can point either up or down. Then we 
can count the number of states of such an array of bits as 
we did in Figure 1.2.42. For one bit we have 2 states, for
two bits it is $2 \times 2 = 4$ states, and for $n$ bits it is clearly
$2 \times 2 \times \cdots \times 2 = 2^n$ states. This shows that there is a direct
relationship between information capacity, i.e. the number 
of bits, and the number of states. This is an exponential 
relation, 


$$n \text{ bits} \;\Leftrightarrow\; 2^{n} \text{ states} \qquad \text{(exponential relation)} . \qquad (1.2.38)$$


This implies that the converse relationship between information
capacity and the number of accessible states is a
logarithmic one:⁹


$$\#\,\text{bits} = \log_2 (\#\,\text{states}) \qquad \text{(logarithmic relation)} . \qquad (1.2.39)$$

This relationship provides a precise and general quantita- 
tive definition of information that forms the very basis of 
information theory. The relation should remind you of the
expression for the entropy $S = k \log W$ of a physical system,
derived by Ludwig Boltzmann, which links the entropy
S as a state variable of a macroscopic system to the total 
number of distinct microscopic states W that correspond to 
that given macroscopic state, as we discussed in Chapter 
1.1 in connection with equation (1.1.62). So, entropy quan- 
tifies the microscopic diversity hidden in what we see as 
a single macroscopic state. In information theory, entropy 
is a measure for information capacity, the information that 
can be stored. 
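
The two relations (1.2.38) and (1.2.39) can be checked by brute force for small memories. The following Python sketch is only an illustration; the function name `memory_states` and the encoding of a magnet as 0 (down) or 1 (up) are assumptions made here.

```python
from itertools import product
from math import log2


def memory_states(n_bits):
    """List every configuration of n_bits two-state magnets (0 = down, 1 = up)."""
    return list(product((0, 1), repeat=n_bits))


for n in range(1, 5):
    states = memory_states(n)
    # the number of states is 2**n; taking log2 recovers the number of bits
    print(n, len(states), log2(len(states)))
```

The list of states doubles with every added bit, while the logarithm grows by just one.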


Entropy and probability. At this point it is interesting to 
refine this relation between available states and informa- 
tion by explicitly introducing the notion of probability. In 
the previous derivation we have tacitly assumed that given 
a single macroscopic state, the probability of finding the 
system in any of the corresponding microscopic states is 
uniform. With $N$ states that would mean that $p_i = 1/N$,
because the total probability should add up to $\sum_{i=1}^{N} p_i = 1$.
In thermodynamics this distribution would correspond to a
closed system at fixed (conserved) energy, and where one
assumes the equipartition of energy.


⁹The information unit bit is linked to the logarithm base-2. If
$S = \log_2 N$ this means that $2^S = N$. Thinking binary means that you
reason in base-2. If I say a number is 21 in base-10, I make the
statement that that number equals $21 = 1 \times 10^0 + 2 \times 10^1 = 1 + 20 = 21$.
If I say a number is 21 in base-2, that statement makes no sense
because the symbol ‘2’ isn’t there. To convert the number 21 in
base-10 to base-2, I have to expand the number in powers of 2, so
$21 = 16 + 4 + 1 = 1 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 \Leftrightarrow 10101$.
In base-10 the digits run from zero to 9, whereas in base-2 you only
have 0 and 1. So the number 1011 in base two equals $1 + 2 + 8 = 11$ in
base 10. This way, all integers can be uniquely encoded in any
integer-based number system.

In general though, the probabilities will not be equal and
one should introduce a probability distribution $\{p_i\}$ over the
microscopic states, as we did in the section on statistical
thermodynamics in Chapter I.1. There, for a system
in thermal equilibrium at temperature $T$, we introduced a
probability $p_i$ which depended on the energy $E_i$ of the
microsystem labeled by ‘i’. For that case the expression
(1.1.69) for the thermodynamic entropy was
first given by Gibbs, and now we see that it corresponds to
an information entropy or information capacity (in bits) of
the system given by the fundamental expression:


$$S = -\sum_i p_i \log_2 p_i . \qquad (1.2.40)$$


We mentioned already that the entropy of the system, as
defined above, is equivalent to its information-carrying
capacity as defined by Claude Shannon. While working
at Bell Labs he published in 1948 a groundbreaking
paper on the transmission of information, which is considered
by many to mark the birth of information science. The
important contribution from our point of view is firstly that he
proved that this expression is the unique solution satisfying
some general constraints on information, and secondly that it
applies in a general context that transcends its physical
origins as thermodynamic entropy. So, that is where the term
information entropy originates.


Let us see what happens if we apply the formula to the
two-spin situation where we have a set of four states which we
denote as $\{11, 10, 01, 00\}$. We may turn on a weak magnetic
field so that, say, the state 11 with both spins up is
energetically preferred, for example leading to a distribution
$\{p_{11} = 1/2,\ p_{10} = p_{01} = p_{00} = 1/6\}$. Then the
corresponding information capacity would be
$S = \frac{1}{2}(1 + \log_2 6) \approx 1.79$ bits, which is clearly smaller than the
uniform case with all $p_i = 1/4$, yielding $S = 2$ bits.
The point I want to make here is that the uniform distribution
is the maximally unbiased distribution, and it is
that distribution which maximizes the information entropy,
precisely because there is no additional constraint on, or
in other words, ‘additional knowledge’ about, the system.


Adding a priori knowledge reduces the information con- 
tent, or the amount of surprise the outcome of measure- 
ments could provide. Constraints always reduce the num- 
ber of allowed states for the system and therefore lower 
the entropy. 
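
The two-spin numbers quoted above are easy to verify. The sketch below evaluates the information entropy $S = -\sum_i p_i \log_2 p_i$ for the uniform and for the biased distribution; the function name `shannon_entropy` is an illustrative choice, and the probabilities are simply those of the example in the text.

```python
from math import log2


def shannon_entropy(probabilities):
    """Information entropy S = -sum_i p_i log2 p_i, in bits."""
    assert abs(sum(probabilities) - 1.0) < 1e-12, "probabilities must sum to 1"
    # states with p = 0 contribute nothing to the sum
    return -sum(p * log2(p) for p in probabilities if p > 0)


uniform = [1/4] * 4              # no prior knowledge about the two spins
biased = [1/2, 1/6, 1/6, 1/6]    # state 11 energetically preferred

print(shannon_entropy(uniform))           # 2.0
print(round(shannon_entropy(biased), 2))  # 1.79
```

Any other distribution over the same four states gives a value below 2 bits, in line with the statement that the uniform distribution maximizes the entropy.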


The Landauer principle. Talking about the relation be- 
tween information and physical entropy it may be appropri- 
ate to briefly mention the principle proposed by Rolf Lan- 
dauer in 1961, which is a particular formulation of the sec- 
ond law of thermodynamics which directly applies to infor- 
mation theory and computation. The principle expresses 
the fact that erasing information necessarily involves pro- 
ducing heat, thereby increasing the entropy. So, in other 
words, the principle governs the intimate relationship be- 
tween information processing and the production of heat. 
This is of great importance, and it explains why large server 
parks tend to move up further north to colder environments. 
The heat produced by computers can certainly be reduced, 
but the improvements are bounded by the second law of 
thermodynamics. 


We have illustrated the principle in Figure 1.2.43. Consider 
a ‘gas’ consisting of a single atom in a symmetric container 
with volume 2V in contact with a heat bath. We imagine 
that the position of the particle acts as a memory with one 
bit of information, corresponding to whether the atom is on 
the left or on the right. 


Figure 1.2.43: The Landauer principle. An illustration of the
Landauer principle using a simple ‘thermodynamical system’
consisting of a single particle in a vessel. See text for
explanation of the successive steps.


Erasing the information amounts to resetting the device
to the ‘reference’ state $|1\rangle$ independent of the initial state,
and therefore reinitializing the system rather than making
a measurement. This can be done by first opening the
diaphragm in the middle, then moving the piston from the
right in, and finally closing the diaphragm and moving the
piston back. In the first step the gas expands freely to
twice the volume. The particle doesn’t do any work, the
energy is conserved, and therefore no heat will be absorbed
from the reservoir. For that reason this is an irreversible
free expansion process by which the entropy $S$ of the gas
increases by an amount $k(\ln(2V) - \ln V) = k \ln(2V/V) = k \ln 2$.
(The number of states the particle can be in is proportional to
the volume; the average velocity is conserved because of
the contact with the thermal bath and will not contribute to
the change in entropy.) In the second part of the erasure
procedure we bring the system back to a state which has 
the same entropy as the initial state. We do this through a 
quasi-static (i.e. reversible) isothermal process at
temperature $T$. During the compression the entropy decreases by
$k \ln 2$. This change of entropy is nothing but the amount
of heat delivered by the gas to the reservoir divided by
the temperature, i.e. $\Delta S = \Delta Q/T$. Therefore the heat
produced $\Delta Q$ equals the net amount of work $W$ that has
been done in the cycle by moving the piston during the
compression. The conclusion is that during the erasure of
one bit of information the device had to produce at least
$\Delta Q = T \Delta S = kT \ln 2$ of heat. This argument shows
that actually the heat computers generate is a necessary
byproduct of them destroying information. It directly links
the destruction of logical information with the thermodynamical
generation of heat. This is a powerful result as
it holds independent of the specific device one is talking 
about. 
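
To get a feeling for the size of this bound, the following sketch evaluates $kT \ln 2$ numerically. The temperature of 300 K is an assumed ‘room temperature’, and the function name `landauer_limit` is an illustrative choice; the value of Boltzmann's constant is the exact SI value.

```python
from math import log

BOLTZMANN_K = 1.380649e-23  # J/K, exact by definition in the SI since 2019


def landauer_limit(temperature_kelvin):
    """Minimum heat (in joules) released when one bit is erased: k T ln 2."""
    return BOLTZMANN_K * temperature_kelvin * log(2)


room = landauer_limit(300.0)  # 300 K: assumed room temperature
print(f"{room:.3e} J per erased bit")
print(f"{room * 8e9:.3e} J to erase one gigabyte (8e9 bits)")
```

The result is of the order of $10^{-21}$ joule per bit, so the bound is a statement of principle: it sets a floor that no device can undercut, whatever its construction.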


To summarize, you could say that ‘forgetting’ has its price 
(in heat). And that raises an interesting question about 
computation in general: can one avoid the heat by doing
computation reversibly? The answer to this question
was given by Charles Bennett in 1982, and is affirmative.
Reversible computation, however, can employ reversible
gates only, and the familiar AND and OR gates (to
be discussed shortly) are not reversible because they reduce
a two-bit input to a one-bit output, producing at least
$kT \ln 2$ of heat upon acting. A reversible computer
doesn’t pay the price of heat, but as all information has 
to be stored, the price of reversible computation is the re- 
quirement of ever-expanding memories! Not so cheap ei- 
ther. 


Models of computation 


Computing is normally done by [a person] writing 
symbols on paper. [...] I assume that the calculation
is carried out on one-dimensional paper,
i.e., on a tape divided into squares. I shall also
suppose that the number of symbols [...] is finite.
[...] The behaviour of the computer at any moment
is determined by the symbols which he is observing,
and his ‘state of mind’. [...] We may suppose
[...] the number of states of mind which need to 
be taken into account is finite. ...the use of more 
complicated states of mind can be avoided by writ- 
ing more symbols on the tape [...] Every [simple] 
operation consists of some change in the physical 
system consisting of the computer and his tape. 

Alan Turing, 
On Computable Numbers with an Application to 
the Entscheidungsproblem, Proc. Lond. Math. Soc. 
2: 42. (1937) 


Turing machines. Armed with a precise and operational
definition of what information is, we should spend some 
time on computation or the processing of information. What 
are the basic underlying principles upon which the opera- 
tion of all our computational devices is based? 


We distinguish an input fed to a ‘machine’ that somehow 
processes that input leading to the desired result. To achie- 
ve this, the computer follows a sequence of instructions ac- 
cording to a certain procedure; an algorithm, or a program 
to produce the output. In a formal sense one could say that 
the device computes the output as a function of the input. 
As we have seen one can always present information in a 
binary way as a sequence of zeros and ones. So comput- 
ers basically evaluate a function of the input, correspond- 
ing to the output. And a basic question concerning com- 
putation is to model this process in its full generality and 
determine what kind of functions can be calculated. 


This is where the notion of a Turing machine comes in, 
which is a formal device satisfying certain specifications 
that can execute computations in the sense that it takes 
input and produces the desired output. It is not a machine 
in the ordinary sense but rather a fundamental model of 
computation. It does not address the question of the possible
physical implementation of the model, of how to make
it into a real machine. It could not care less whether you
build it with rods and wheels, or like a fluid system with 
pipes and valves, or with Lego, or with elementary elec- 
tronic semiconductor components called transistors. 


Turing’s starting point was in fact a rather natural and intu- 
itive one based on the notion of an effective computation. 
A computation, procedure, or algorithm is called ‘effective’ 
if it satisfies the following criteria: 


(i) it is specified in terms of a finite number of exact
instructions,

(ii) if the instructions are carried out without errors, the
desired result is obtained in a finite number of steps,

(iii) the instructions could in principle be carried out by a
person using only paper and pencil,

(iv) this person does not need any particular insight or
ingenuity to carry out the instructions.


Figure 1.2.44: Turing machine transitions. The four possible
actions of the R/W head in a transition from time step $t$ to
$t+1$: (r) if it writes it cannot move in the same step, and the
state may either change or not; (l) if it does not write it can
move at most one step either to the left or to the right, but
cannot change the state.


Note that there is no restriction on the amount of paper 
(memory), nor on the time it might take to perform the 
computation, apart from it being finite. The computation 
is ‘effective’ but not necessarily ‘efficient’.


The Turing machine can in principle perform any such ‘ef- 
fective computation’ and is defined as follows: 


(i) it has a (half-)infinite tape containing cells labeled by an
integer $p$; each cell contains a symbol taken from an
alphabet $A$. In the following we will just take the alphabet
to be $\{0, 1\}$, meaning that the tape is just a binary string
which has a non-trivial input that starts on the left and may
end with only zeros on the right.

(ii) a read/write head which is positioned at a given cell
where it can read and (re)write the tape if instructed to do 
so. There are restrictions on what the head at any stage 
can do. 

(iii) At any given time the machine is in some definite internal
state $S_i$ which is an element of some finite state space
$F_S$. The program or algorithm corresponds to a table that
precisely specifies for every state what transition it has to 
make if the head reads either a zero or a one. This instruc- 
tion specifies (a) what the head has to do, and (b) to what 
state the machine is supposed to go. 

(iv) At the start of the computation, the input is the binary 
string on the tape. The head is located at the $p = 0$ cell
and the machine is in the internal state $S_0$. The program
halts if it reaches a final state (the output) where it finds no 
further executable instructions. So this is how a Turing ma- 
chine computes a binary output function from some binary 
input. 


From the fact that for any effective computation there is a 
Turing machine, one can prove the existence of a univer- 
sal Turing machine that can perform all effective compu- 
tations. This machine defines the set of Turing-calculable 
functions. 


This rather intuitive definition of Turing-computability is the 
subject of the Church-Turing thesis which is central in the 
theory of computation. The Church-Turing thesis states 
that Turing computability is equivalent to the much more 
formal definition of computability based on recursive func- 
tions and Abacus machines. We are not going to dwell on 
these topics as they are really outside the scope of this 
book. The thesis cannot be proven as it links formal to 
intuitive notions. It is actually a hypothesis and all that 
can be said is that no counter example has been found 
so far. 


Figure 1.2.45: Turing machine state diagram. The state diagram
for the digital adder described in the text. In this program
the machine goes through four states before it halts; in each
state the move or write instructions on what to do if the head
reads a 0 or a 1 are indicated.


At this point it is probably helpful to describe a basic version
of the machine in some detail. In Figure 1.2.44, we
have the computer in some state $S_i$ and we show the tape
with the R/W head at some position $p$. The program tells
the head what to do but the possibilities are very restricted.
There are only four possible transitions for the head/machine
to execute:


(i) it stays at position $p$ and does not change the entry, with
$S_i \to S_j$;

(ii) it stays at position $p$ and does change the entry on the
tape, in which case it also may or may not change the state
of the system, $S_i \to (S_i \text{ or } S_k)$;

(iii) it moves to the right ($p \to p + 1$) with $S_i \to S_i$;

(iv) it moves to the left ($p \to p - 1$) with $S_i \to S_i$.


The permitted transitions are schematically depicted in Fig- 
ure 1.2.44. 


A Turing machine can also be represented by a finite state 
diagram. This diagram is a directed network where the 
nodes are the states $S_i$ and the directed edges represent
the instructions. Instructions where the state does not
change correspond to lines returning to the same state. 
The number of arrows leaving the node equals the num- 
ber of symbols in the alphabet (in our case there are only 
two). 


In Figure 1.2.45 we have depicted the state diagram corresponding
to a program that can add two positive integers
$m$ and $n$. We should think of the input as the two numbers
in unary coding (this means that a number $k$ is represented
by a sequence of $k + 1$ symbols 1), separated by a 0, with
zeros also on the left and on the right. So the input sequence
on the tape would look like:


$$000\,[11\ldots11]_{m+1}\,0\,[11\ldots11]_{n+1}\,000 .$$


The head should then walk along the string of symbols
starting from the leftmost 1, moving to the right
till it hits the in-between 0, and change that 0 into a 1, so that
the sequence then looks like:


$$000\,[11\ldots11111\ldots11]_{n+m+3}\,000 .$$


Next the head should move to the left till it hits the first 0
on the left, then move right again, changing the first two 1
symbols into 0’s. The result is the required sequence
representing the desired outcome:


$$000\,[11\ldots11111\ldots11]_{m+n+1}\,000 .$$


You may verify that this sequence of steps is indeed per- 
formed by the machine depicted in Figure 1.2.45, by follow- 
ing the sequence step by step. 
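
The adding procedure just described can also be traced with a small table-driven simulator. This is a sketch under stated assumptions: the state names A to D, the action codes and the rule table `ADDER` are a reconstruction of the steps in the text, not a transcription of Figure 1.2.45. The simulator follows the restricted transitions listed earlier: a step either writes without moving, moves without changing the state, or only changes the state.

```python
def run(tape, head, state, rules, max_steps=10_000):
    """Run a rule table until no instruction applies; return the final tape."""
    cells = dict(enumerate(tape))            # sparse tape, blank cells read '0'
    for _ in range(max_steps):
        symbol = cells.get(head, '0')
        if (state, symbol) not in rules:     # no executable instruction: halt
            break
        action, state = rules[(state, symbol)]
        if action == 'R':
            head += 1                        # move right, nothing written
        elif action == 'L':
            head -= 1                        # move left, nothing written
        elif action in ('0', '1'):
            cells[head] = action             # write, head stays put
        # action 'N': neither write nor move, only the state changes
    else:
        raise RuntimeError("step limit reached without halting")
    return ''.join(cells.get(i, '0') for i in range(min(cells), max(cells) + 1))


# (state, symbol read) -> (action, next state); hypothetical reconstruction
ADDER = {
    ('A', '1'): ('R', 'A'),     # walk right over the first block of 1s
    ('A', '0'): ('1', 'B'),     # fill the separating 0 with a 1
    ('B', '1'): ('L', 'B'),     # walk back to the left end
    ('B', '0'): ('N', 'C'),     # first 0 on the left found
    ('C', '0'): ('R', 'C'),     # step onto the leftmost 1
    ('C', '1'): ('0', 'D'),     # erase the first 1
    ('D', '0'): ('R', 'D'),     # step onto the next 1
    ('D', '1'): ('0', 'HALT'),  # erase the second 1, then halt
}


def add_unary(m, n):
    """Add m and n, each coded as a block of k + 1 ones."""
    tape = '0' + '1' * (m + 1) + '0' + '1' * (n + 1) + '0'
    result = run(tape, head=1, state='A', rules=ADDER)
    return result.count('1') - 1             # decode: k + 1 ones represent k


print(add_unary(2, 1))  # 3
```

The machine passes through the four working states A, B, C and D before it halts, and the number of steps grows only linearly with the length of the input.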


We see that this simple problem already needs a quite 
complicated diagram. It is therefore more convenient to 
work in terms of logical gates, to which we now turn. 


Logical gates. A computation is formally the calculation of
a function $f$ of many binary variables, so $f(a_1, a_2, \ldots, a_n) = b$.
The circuit for $f$ should, after entering an input of any set
of $a$ values, return a binary number $b$. In practice one starts
with a universal set of simple logical gates that compute
certain basic functions. By combining many of those in
specific parallel and serial arrangements, arbitrarily
complicated functions can be composed. Diagrams with logical
gates are simpler and more practical than going all the
way back to the underlying Turing state diagrams.


a | NOT a
0 |   1
1 |   0

a b | AND  OR  XOR
0 0 |  0    0    0
0 1 |  0    1    1
1 0 |  0    1    1
1 1 |  1    1    0


Figure 1.2.46: Logical gates. The one-bit NOT gate, some
two-bit gates, and their logical tables.


The basic gates typically have only one- or two-bit inputs 
and a one-bit output, like: 


(i) the NOT gate inverting the value of a single bit, meaning 
that if the bit contains a 1, then it is changed to a 0 and vice 
versa; 

(ii) the OR and the AND gate. These are 2-bit gates, they 
are irreversible because they reduce the information of the 
2-bit input to a 1-bit output. 
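
The statement about irreversibility can be made concrete with a few lines of Python. In this sketch the gates are modeled as functions on the integers 0 and 1, and the helper `is_reversible` (a name chosen here for illustration) simply checks whether distinct inputs always lead to distinct outputs, which is what is needed to run a gate backwards.

```python
from itertools import product


def NOT(a):
    return 1 - a


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def is_reversible(gate, n_inputs):
    """A gate can be undone only if distinct inputs give distinct outputs."""
    inputs = list(product((0, 1), repeat=n_inputs))
    outputs = {gate(*bits) for bits in inputs}
    return len(outputs) == len(inputs)


for a, b in product((0, 1), repeat=2):
    print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))

print(is_reversible(NOT, 1))  # True
print(is_reversible(AND, 2))  # False
print(is_reversible(OR, 2))   # False
```

Four possible inputs are squeezed into two possible outputs, so the input cannot be recovered from the output of an AND or OR gate.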


One may prove that the set of these three gates is universal,
in that they allow you to make machines to perform
all the effective computations as defined by Turing.
There are many other gates possible and these may be
preferred depending on the problem one wants to solve,
for example the exclusive OR gate, also called XOR gate.
It returns zero if both input bits are equal and one if they
are different. These simple gates compute simple binary
functions which can be represented in so-called truth tables
where the possible input values (the arguments) are
given on the left whereas the function value appears on the
right. For the basic gates these tables are given explicitly
in Figure 1.2.46.


Figure 1.2.47: Multiplication. A multiplication for two 2-bit
numbers ‘by hand’ and the corresponding digital multiplier
schematic composed of six AND gates and two XOR gates. The two
lines of multiplication are performed in parallel, and the
subsequent additions are sequential, so one needs a total of
$2n - 1 = 3$ time steps.


In Figure 1.2.47 we demonstrate for example how to mul- 
tiply two 2-bit numbers. We calculate 3 x 3 = 9, which 
in binary terms reads 11 x 11 = 1001. On the left we 
show how this is done by ‘hand’ with pencil and paper, and 
on the right how it is done by a logical device consisting 
of some AND and XOR gates. Using the truth tables it 
is quite straightforward to follow the lines and put the bit-values
on them, and convince oneself that it indeed works.
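
For readers who prefer to check this in code rather than by following lines in a diagram, here is a minimal sketch of a 2-bit multiplier built from six AND gates and two XOR gates. The wiring below is one standard arrangement that matches the gate count quoted for Figure 1.2.47; it is an assumption that the figure uses exactly this wiring, and the function name `multiply_2bit` is an illustrative choice.

```python
def AND(a, b):
    return a & b


def XOR(a, b):
    return a ^ b


def multiply_2bit(a1, a0, b1, b0):
    """Multiply the 2-bit numbers (a1 a0) and (b1 b0); return bits (p3, p2, p1, p0)."""
    # partial products: four AND gates that can all act in parallel
    a0b0, a1b0 = AND(a0, b0), AND(a1, b0)
    a0b1, a1b1 = AND(a0, b1), AND(a1, b1)
    p0 = a0b0
    p1 = XOR(a1b0, a0b1)        # sum bit of the middle column
    carry = AND(a1b0, a0b1)     # carry out of the middle column
    p2 = XOR(a1b1, carry)       # sum bit of the left column
    p3 = AND(a1b1, carry)       # final carry
    return p3, p2, p1, p0


print(multiply_2bit(1, 1, 1, 1))  # (1, 0, 0, 1), i.e. 11 x 11 = 1001
```

Running through all sixteen input combinations confirms that the circuit reproduces ordinary multiplication of the numbers 0 to 3.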


Going quantum 


Until recently, most people thought of quantum me- 
chanics in terms of the uncertainty principle and 
unavoidable limitations on measurement. ... The appreciation
of the positive application of quantum effects
to information processing grew slowly.
Nicolas Gisin 


Once we have come to appreciate the basic fact that infor- 
mation capacity is directly related to the ‘number’ of avail- 
able states of a system, it is immediately clear that if we are 
to descend to the level of quantum mechanics, we have to 
think in terms of quantum states. As we will see, quan- 
tum states are quantessentially different from their classi- 
cal precursors, and therefore we should be prepared to go 
back to the drawing board and define from scratch what 
we mean by information. The space of states has a com- 
pletely different structure indeed, and that forced the scien- 
tists to start developing what is nowadays called quantum 
information theory. 


It is in that way that a turning point in our understanding 
of what matter really is on the microscopic level induced 
a radical change in our basic notion of information. It was 
the eminent physicist Richard Feynman who maybe for the 
first time pointed out some of the basic principles in a well- 
known paper entitled There is plenty of room at the bottom. 
The change did not just affect the abstract, software side of 
information theory, but also the hardware side. The crucial 
challenge is nowadays to develop new types of quantum 
technology that allow us to store and manipulate quantum 
information. Without exaggeration one may say that this 
constitutes a new holy grail for experimental physics and 
engineering. 


There are basically two reasons why information will go
quantum. The first is that information science has to confront
quantum physics at some point because of Moore’s
law. The second is that scientists who looked more thoroughly
into the equations governing quantum information
made the astounding discovery that for a number of tasks
the quantum computer is vastly more powerful than its
classical digital counterpart.


Figure 1.2.48: Moore's law. This law states the astonishing fact
that over the last half a century the power of computing has
doubled every 18 months. The continuous downscaling of the basic
components forces us to enter the gates of the quantum domain.
(Source: High Tech Forum)


Moore’s law. This is an empirical ‘law’ which refers to the
spectacular fact that our computational power over the last
half a century has increased at an incredible rate: on average
it has doubled every 18 months, as you can see in
Figure 1.2.48. This implies that it has been growing
exponentially for more than half a century! We are now at a
stage where a single active component of an integrated
digital circuit has a size of about 10 nanometer, very small
indeed. Once you realize that atoms have a size of a few
tenths of a nanometer, it is clear that Moore’s law has to break
down if we don’t succeed in entering the quantum domain. In other
words the continued scaling down in the size of the hardware
components forces us to enter the quantum world
one way or another!


RSA-2048 =

2519590847565789349402718324004839857142928212620
4032027777137836043662020707595556264018525880784
4069182906412495150821892985591491761845028084891
2007284499268739280728777673597141834727026189637
5014971824691165077613379859095700097330459748808
4284017974291006424586918171951187461215151726546
3228221686998754918242243363725908514186546204357
...


Figure 1.2.49: RSA-2048. RSA-2048 is a number with 2048
binary and 617 decimal digits. The factorization has not been
found yet.


A tough problem: integer factorization. But going quan- 
tum also means that we turn something that at first sight 
looks like a crisis into a tremendous opportunity. Quantum 
mechanics is so fundamentally different, that it would allow 
for a quantum computer to solve problems that would be 
intractable on our classical digital computers. 

A famous example is the factoring problem: I give you a
very large integer $N$ of $n$ digits which I tell you can be
written in a unique way as the product of two prime numbers
$M_0$ and $M_1$. I don’t tell you what they are, but instead ask
you to find $M_0$ and $M_1$. This turns out to be an extremely
hard problem not only for people but also for very, very big
computers. Hard in the sense of the time needed to find the
answer. Numbers of this type, which can be factorized into
two prime factors, are called RSA numbers and they have
important applications in cryptography.


That may surprise you but let us get a rough idea of why 
this is so. 

A simple way to find the divisors is the method of ‘trial
division’, which goes back to the medieval Italian mathematician
Fibonacci. To know whether a number $N$ has a divisor
$M$ you start with $N$ and keep subtracting $M$ until after $k$
steps you get a number smaller than $M$; if that number
happens to be zero then $M$ is a divisor of $N$. You start
doing this by choosing $M = 2$ and that takes care of all even
divisors. Clearly the next number $M$ we have to check for
would be the next prime number, but that requires that the
list of primes is known. To get a rough estimate, what we
can do is to check divisibility for all odd divisors. One
additional observation that simplifies the search is the fact that
if the two prime factors are unequal then one will be larger
than $\sqrt{N}$ and the other smaller. We thus have to check
the divisor property only up to $\sqrt{N}$. Knowing that apart
from the number 2 all prime numbers are odd, we only have to
search for odd divisors, which leads to a further
reduction. An estimate for the maximum number of simple
subtractions $P^*$ in such a worst-case scheme would give:


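$$P^*(N) \simeq \sqrt{N}\,\Big(\frac{1}{2} + \sum_{k=1}^{\sqrt{N}/2} \frac{1}{2k+1}\Big) \tag{I.2.41a}$$

$$\phantom{P^*(N)} \simeq \sqrt{N}\,\ln\sqrt{\sqrt{N}+1} \;\simeq\; \Big(\frac{n}{4}\ln 2\Big)\, 2^{n/2} . \tag{I.2.41b}$$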


In the last line we have assumed the number N to be an
n-bit number, N ~ 2^n, and kept only the leading term in
n. The key conclusion we draw from this rough estimate is
that the core time needed to factorize an n-bit RSA number
grows exponentially with n. It is no surprise then that
children find factorizing to be much harder than
multiplication, and that is why in the pre-calculator era
they had to learn the multiplication tables (which are also
factorization tables) from 1 to 20 by heart, as if it
concerned the first few couplets of a universal human
anthem! And with computers we now do the same thing, reading
values from tables, whether they like it or not. A realistic
example of such a gigantic number is RSA-2048 shown in
Figure I.2.49, having 617 decimal or 2048 binary digits. It
is a public challenge to factorize it into two primes, and
if you meet the challenge you get US$ 200,000. Unfortunately
the number of dollars does not come near N, but it is
nevertheless worth giving it a try! But wait, is that true?
We just calculated that the amount of processor time would
typically be t*(2048) ~ P*(n = 2048) × Δt, with Δt the time
needed for a single subtraction. Since P*(2048) exceeds
10^{310}, this is longer than the age of the universe by
hundreds of orders of magnitude for any conceivable
processor. This is a clear warning that you have to come up
with a rather smart idea.


RSA-768 =
1230186684530117755130494958384962720772853569595
3347921973224521517264005072636575187452021997864
6938995647494277406384592519255732630345373154826
8507917026122142913461670429214311602221240479274
737794080665351419597459856902143413

= 3347807169895689878604416984821269081770479498371... × ...
[remaining digits of the two prime factors are illegible in the figure]


Figure I.2.50: RSA-768. RSA-768 is a number with 768 binary
and 232 decimal digits. The factorization given below was
obtained through a heroic effort by an international
collective of experts. It would have taken a powerful
super-computer some 2000 years, but they managed to do it in
just two years.


An example of an integer that has been successfully
factorized into its two prime factors is RSA-768, with 768
binary or 232 decimal digits. This was achieved in a heroic
effort by an impressive international collective of computer
experts and mathematicians, using a tremendous amount of
algorithmic ingenuity and digital power. The result is
displayed in Figure I.2.50.
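
The trial-division estimate above can be made concrete with a small sketch. It is an illustration under stated assumptions, not a serious factoring code: the function name `trial_division` is ours, it uses the remainder operator instead of repeated subtraction, and it counts trial divisors rather than individual subtractions, so it shows the 2^{n/2} growth but not the prefactor of the estimate.

```python
import math

def trial_division(N):
    """Return (smallest nontrivial factor or None, number of trial divisors tested).

    Hypothetical helper for illustration: tries M = 2 and then every odd M up to sqrt(N).
    """
    tested = 1                      # the even divisor M = 2
    if N % 2 == 0:
        return 2, tested
    limit = math.isqrt(N)           # no need to look beyond sqrt(N)
    M = 3
    while M <= limit:
        tested += 1
        if N % M == 0:
            return M, tested
        M += 2                      # skip even candidates
    return None, tested             # N is prime

# Worst case: a product of two primes of nearly equal size.
for p, q in [(251, 257), (65521, 65537)]:
    N = p * q
    factor, tested = trial_division(N)
    print(f"n = {N.bit_length()} bits: factor {factor} found after {tested} trial divisors")
```

Doubling the number of bits from 16 to 32 multiplies the work by a factor of about 2^8, in line with the 2^{n/2} scaling.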
Quantum factorization. We concluded that with a classical
computer the typical time it takes to factor N into its
prime factors grows exponentially with its size n, but the
American applied mathematician Peter Shor proved in 1994
that with a quantum computer the job can be done in
polynomial time. We will discuss the (quantum) algorithm he
constructed in more detail towards the end of Chapter II.4
in Volume II.


The factorization problem is in a strange way asymmetric:
finding the integers M_0 and M_1 is kind of exponentially
hard, but if you give me those integers, you and I can
simply check whether you are right by just multiplying them
using a large calculator, in a time that grows only
polynomially with n. Factorization is one of the main tools
in cryptography, so it is not just a matter of academic
interest. It is of prime interest to all those who are
concerned about security and safe transactions via the
internet, like banks (and their clients), medical services,
intelligence agencies and twittering celebrities. In fact,
with today’s world in a severe state of cybernation, all of
us are highly dependent on a secure internet!


To see the huge importance of exponential vs. polynomial
scaling, suppose an elementary computational step takes
Δt seconds. If the number of steps increases exponentially,
factorizing an n-bit number will take Δt 2^{an} seconds,
where a is a constant that depends on the details of the
algorithm. We have depicted some of the different
computation time behaviors in Figure I.2.51. The take-home
message there is the huge qualitative disparity between
polynomial and exponential behavior that becomes manifest
for large n.


For example, if Δt = 10^{-3} and a = 0.1, factoring a number
with n = 1,000 binary digits would roughly take 10^{27}
seconds, which is much, much longer than the lifetime of the
universe (which is a mere 4.6 × 10^{17} seconds). In
contrast, if the number of steps scales as the third power
of the number of digits, the same computation takes
a'Δt n^3 seconds, which with a' = 10^{-2} is 10^4 seconds or
a little under three hours. Of course the constants a, a'
and Δt are implementation dependent, but because of the
dramatic difference between exponential versus polynomial
scaling for sufficiently large n, there is always a huge
qualitative gap in speed that cannot be compensated for by
adding more pieces of conventional hardware.


Figure I.2.51: Computational complexity. The classes P and
NP refer to the growth of time needed to solve a problem of
size n. Problems in P can be solved in polynomial time
(t ~ n^α for some number α) and problems in NP that lie
outside P cannot. These might grow exponentially (~ 2^n) or
super-exponentially (like the factorial ~ n!). (Source: C.
Moore, SFI)
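
The arithmetic behind this comparison is easy to check. The sketch below assumes the illustrative values Δt = 10^{-3} s, a = 0.1 and a' = 10^{-2}, chosen so that the polynomial case comes out at a little under three hours; the function names are ours and the constants are not measured properties of any real machine.

```python
def exponential_time(n, dt, a):
    """Running time in seconds when the number of steps grows as 2**(a*n)."""
    return dt * 2.0 ** (a * n)

def polynomial_time(n, dt, a_prime, power=3):
    """Running time in seconds when the number of steps grows as a_prime * n**power."""
    return a_prime * dt * n ** power

# Assumed, purely illustrative constants.
dt, a, a_prime = 1e-3, 0.1, 1e-2
n = 1000                      # number of binary digits
age_universe = 4.6e17         # seconds, the figure quoted in the text

t_exp = exponential_time(n, dt, a)
t_poly = polynomial_time(n, dt, a_prime)

print(f"exponential: {t_exp:.2e} s = {t_exp / age_universe:.1e} ages of the universe")
print(f"polynomial : {t_poly:.2e} s = {t_poly / 3600:.2f} hours")
```

Changing the constants shifts both numbers, but for large enough n the exponential time always wins by an enormous margin.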


I should add that for the factoring problem as such, the
situation is in fact more subtle: at present the best
available classical algorithm does significantly better than
exponential; it would require O(exp(n^{1/3} log^{2/3} n))
operations, whereas an available quantum algorithm provided
by Shor needs O(n^2 log(n) log(log n)) operations. To give
you an impression we give a log-linear plot of the two
factorization times in Figure I.2.52, and you can see that
the behavior for large n is qualitatively drastically
different, with slopes tending to 1/3 (classical) and zero
(quantum).
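
One way to make the quoted slopes tangible is to look at how the logarithm of the operation count grows with n. The sketch below is an interpretation, not a reproduction of the figure: we assume the slopes refer to ln(ln t) plotted against ln n, drop all constant prefactors hidden in the O(...) notation, and use function names of our own.

```python
import math

def log_classical_ops(n):
    """ln of exp(n^(1/3) * (ln n)^(2/3)), the classical operation count."""
    return n ** (1 / 3) * math.log(n) ** (2 / 3)

def log_quantum_ops(n):
    """ln of n^2 * ln(n) * ln(ln n), the quantum operation count."""
    return 2 * math.log(n) + math.log(math.log(n)) + math.log(math.log(math.log(n)))

def loglog_slope(log_ops, n, eps=1e-3):
    """Central finite difference for d ln(ln t) / d ln n at size n."""
    up = math.log(log_ops(n * (1 + eps)))
    down = math.log(log_ops(n * (1 - eps)))
    return (up - down) / (math.log(1 + eps) - math.log(1 - eps))

for n in (1e3, 1e6, 1e9, 1e12):
    s_c = loglog_slope(log_classical_ops, n)
    s_q = loglog_slope(log_quantum_ops, n)
    print(f"n = {n:.0e}: classical slope {s_c:.3f}, quantum slope {s_q:.3f}")
```

Under these assumptions the classical slope creeps down towards 1/3 while the quantum slope drifts towards zero, both only logarithmically slowly.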


Factorization is only one of several problems that could
potentially benefit from quantum computing. The implications
of quantum information even go beyond quantum computing, and
include diverse applications such as quantum cryptography
and quantum communication, which by the way is intrinsically
secure.


Figure I.2.52: Factorization algorithms. A log-linear plot of
the estimated time it takes to factor an n-digit number with
the best available classical and quantum algorithms
mentioned in the text. (Curves shown: classical algorithm,
quantum algorithm.)


The quantum leap to such mind-boggling speed-ups arises
from two main sources. Firstly from the intrinsically
parallel nature of quantum mechanics, which in turn is a
consequence of a quantessential feature called the linear
superposition principle. This parallelism basically derives
from the fact that state vectors have many components, and a
quantum interaction or operation or gate affects all
components simultaneously. Secondly from the existence of
so-called entangled states that are unique to quantum
theory. Particles that are in an entangled state can be
correlated in a way which is not possible in classical
physics. We will talk in quite some detail about these
quantessential notions in Volume II. The actual workings of
quantum theory were apparently sufficiently subtle that it
took many decades after the discovery of quantum mechanics
before anyone realized that its computational potential was
fundamentally different and quite powerful indeed. The huge
interest in quantum information and computation in recent
years has caused a thorough re-examination of the concept of
information contained in physical systems, spawning the
field that is referred to as ‘quantum informatics.’


Computational complexity. One of the deeper issues in
the theory of computation is to try and quantify what we
mean by computational complexity. Roughly speaking, a
measure of the complexity of a problem would be the time
it takes to solve the problem on a computer running an
optimal program (algorithm) for that problem. The time it
takes to multiply two n-digit numbers on a computer, for
example, would naively grow quadratically with their size
n, because you have to do of the order of n^2 basic
multiplications (plus additions). You can gain a factor n
by parallelizing the algorithm: the multiplications giving
the n ‘rows’ in the standard multiplication chart can be
done in parallel, and the subsequent additions have to be
done sequentially, as indicated in Figure I.2.47. The
classification of complexity is now linked to the functional
dependence of the computation time on n.
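
The quadratic growth of naive multiplication can be seen by simply counting the single-digit products in the schoolbook method. The following is a minimal sketch with a helper name of our own choosing; it is meant to expose the n^2 count, not to compete with the built-in integer multiplication.

```python
def schoolbook_multiply(x, y):
    """Multiply two non-negative integers digit by digit (base 10).

    Returns (product, number of single-digit multiplications performed).
    """
    xd = [int(c) for c in str(x)][::-1]   # least significant digit first
    yd = [int(c) for c in str(y)][::-1]
    result = [0] * (len(xd) + len(yd))
    count = 0
    for i, a in enumerate(xd):            # one 'row' of the multiplication chart
        carry = 0
        for j, b in enumerate(yd):
            count += 1                    # one basic digit multiplication
            total = result[i + j] + a * b + carry
            result[i + j] = total % 10
            carry = total // 10
        result[i + len(yd)] += carry
    product = int("".join(str(d) for d in reversed(result)))
    return product, count

for x, y in [(12, 34), (1234, 5678), (12345678, 87654321)]:
    product, count = schoolbook_multiply(x, y)
    assert product == x * y               # cross-check against Python's own result
    print(f"{len(str(x))}-digit inputs: {count} digit multiplications")
```

Doubling the number of digits quadruples the count, which is the n^2 behavior described above.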


There is a crucial distinction to be made here. Firstly,
there are problems that can be solved in polynomial time,
meaning that time is bounded by some simple power law
t < n^α. Such a problem is by definition in the ‘polynomial’
class P, but one believes that there are many problems
that do not belong to P and they belong to a larger set,
containing P as a subset, denoted by NP. Note that NP does
not just mean ‘not polynomial.’ The set NP contains
problems of the ‘find-the-needle-in-a-haystack’ type. These
are hard to solve because you basically have to do an
exhaustive search of the whole stack and that takes a hell
of a lot of time. The distinguishing property for NP is that
once you have found an answer it is straightforward to check
whether your answer is right or wrong. Easy, because a
needle is a needle, isn’t it? The formal statement is that
the answer to an NP problem can be checked in polynomial
time.


Figure I.2.53: Complexity classes. Hypothetical hierarchy of
computational complexity classes and some standard problems
belonging to them. Note that the integer factorization and
graph isomorphism problems are classically believed to be
not in P but in NP, while in quantum informatics
factorization belongs to QP. (Source: M. Freedman et al.)
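
For factoring, the ‘needle is a needle’ property is easy to illustrate: checking a proposed pair of factors costs one multiplication, whereas finding the pair by trial division costs a number of steps that grows with √N. The sketch below uses two well-known Mersenne primes as a stand-in example; the helper names are ours, and the check only confirms the product, not that the factors are prime.

```python
def check_factorization(N, p, q):
    """Verify a claimed factorization with a single multiplication."""
    return 1 < p < N and 1 < q < N and p * q == N

def trial_divisors_needed(smallest_factor):
    """Number of candidates (2, 3, 5, 7, ...) tried before an odd factor is hit."""
    return 1 + (smallest_factor - 3) // 2 + 1

p = 2**31 - 1          # Mersenne prime
q = 2**61 - 1          # Mersenne prime
N = p * q

print("claimed factors correct:", check_factorization(N, p, q))
print("wrong claim rejected   :", not check_factorization(N, p + 2, q))
print("trial divisors needed to find p:", trial_divisors_needed(p))
```

The check is instantaneous, while the search would need about a billion trial divisions even for this modest 92-bit number.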


The hardest problems in NP are called NP-complete. The
NP-complete problems are in an abstract way equivalent,
meaning that they can be mapped onto each other in
polynomial time. If you solve one, you have solved all of
them. Integer factoring is in NP and is believed not to be
in P. Furthermore the problem is not considered to be
NP-complete; it is believed to belong to an intermediate
class. The complexity of complexity theory is that we do not
a priori know whether a super smart algorithm exists that
factors large integers into their prime factors in
polynomial time: we have not been able to find such an
algorithm yet, nor have we found a formal proof that it does
not exist. We find ourselves in a serious catch-22
situation. Therefore one likes to say that certain problems
are ‘believed’ to lie outside P.


P versus NP. Indeed, the million dollar question really is
whether NP in the end is not just equal to P! Here we just
have to wait for some real or artificial computer genius to
strike. That question by the way is considered to be so
fundamental that it appears on the illustrious list of seven
Millennium problems of the Clay Mathematics Institute in the
US, which were announced at a meeting in Paris, held on May
24, 2000 at the Collège de France. Just solve it and they
will pay you that million dollars!


Clearly the advent of quantum information theory calls for a
new complexity classification scheme, with new categories
denoted as QP and QNP. And therefore the complexity
analysis becomes even more intricate. Whereas factorization
is classically believed to be in NP but outside P, on a
quantum computer it is in QP, as we have indicated in
Figure I.2.53. Nevertheless, as things stand now, there is
still a remote but dramatic possibility that the content of
this complexity picture in the end collapses to a single
point!


We will return to what a qubit, the fundamental building
block of a quantum computer, exactly is, as well as to the
basics of quantum communication, in Part II of the book.
Quantum computation as a branch of science nowadays
involves sophisticated and highly specialized subfields of
experimental physics which are beyond the scope of this
introductory book. We want to restrict ourselves to the
quantessence after all. One quantessential conclusion we
want to draw here is that information will go quantum not
too long from now. Or, to quote Nelson Mandela: ‘It always
seems impossible until it’s done.’


Quantum physics: the laws of matter 


[The homeland] looked strange to us returned soldiers...
The civilians talked a foreign language. I found serious
conversation with my parents all but impossible.

Robert Graves, Goodbye to All That. 
Understanding the deep structure of matter has led to a
new conceptual basis for all of physics. A basis that
governs the laws of new fundamental particles and force
fields but also of new phases of condensed matter, of
chemistry and finally the laws of quantum information.


Surprisingly, this section is the shortest of this chapter;
the reason is simply that we still have a whole book in
front of us on the subject. Quantum theory has the names of
many great scientists associated with it, and not just
because of the saying that success always has many parents.
Roughly speaking one distinguishes three generations of
quantum physicists. The first generation consists of people
like Max Planck, who coined the idea that the energy of heat
radiation is quantized, Albert Einstein, who, following
Planck, postulated the existence of a particle of light
(later called the photon) and explained the photoelectric
effect using this new particle, and finally Niels Bohr, the
great Danish physicist whose model for the atom proved to be
a tremendous breakthrough. A second generation consists of
great names like Erwin Schrödinger, Werner Heisenberg, Paul
Dirac and others, who managed to give a mathematical
foundation for the theory and derive its fundamental
equations. Many other luminaries like Wolfgang Pauli, Max
Born, Enrico Fermi and John von Neumann greatly enhanced our
understanding and interpretation of the theory (see Table
B.1 on page 645 of Part III).


After the Second World War a third generation took the
stage, with the development of quantum field theory as the
most outstanding fundamental contribution. Great physicists
like Richard Feynman, Julian Schwinger and Sin-Itiro
Tomonaga completed quantum electrodynamics shortly after
the war, and during the sixties and seventies a long list
of distinguished scientists constructed the Standard Model
of elementary particles and fundamental forces (see Table
B.3 on page 647 of Part III).


Parallel to these developments many new research directions
opened up, such as quantum chemistry, quantum condensed
matter theory, quantum material science and quantum optics
(see Table B.2 on page 646 of Part III). We would also like
to mention the fundamental progress in our theoretical
understanding of quantum principles that these three
generations and generations after them have left us with.
This book is of course completely devoted to these matters
and we will discuss what the central ideas of quantum theory
are and how counter-intuitive and therefore unbelievable
these ideas must have appeared at the time of their
inception. You might experience some of that same uneasiness
as you read along. As a matter of fact quantum physicists
all around the globe have acquainted themselves with the
theory to such a degree that most of them have developed
some kind of ‘quantum intuition.’ And yet, in spite of that,
they are still regularly taken by surprise by what nature is
telling them.


The development of quantum theory is one of the most 
astonishing achievements of twentieth century science to 
which a large number of gifted characters have contributed 
in the period of time encompassing the two world wars. It 
paved the way for a multitude of technological advances 
and even now we feel that the era of quantum technologies 
has only just started. This is exemplified by the promis- 
ing developments where quantessential principles are ex- 
ploited to create a totally new type of information science, 
involving quantum computing, quantum teleportation and 
quantum cryptography. Such is the power of truly new
fundamental insights into the workings of nature: what at
first appears to be a pastime for absent-minded eggheads
ends up as a core ingredient of radical innovations and new
technologies. Innovations that have offered new options for
society, and often have deeply affected the human condi- 
tion. 


This book is quite voluminous, but that should not surprise 
you once you realize that — as is in full display in the tables 
at the end of the book — so many Nobel prizes have been 
awarded in this incredibly prolific field of science. 


Chapter I.3 


Universal constants, scales and units 


Is man the measure of all things? 


Physicists have come to appreciate the existence of cer- 
tain universal constants of nature like the velocity of light, 
Newton’s constant, the elementary charge, Planck’s con- 
stant etc. These are numbers that cannot be calculated 
from first principles. They have to be obtained from mea- 
surements and their values set the scales that character- 
ize our universe. First we show how these constants can 
be used to define a complete and consistent system of 
units. In the second section, we take a step back and 
ask whether these constants are really universal, or just 
the parameters that appear in our theories and therefore 
only reflect the present state of science. In the third sec- 
tion, we play around with these constants to explore to 
what extent these natural scales mark the domains of va- 
lidity of particular theories. We conclude by describing the 
Planck system of ‘natural’ units and discuss its interpre- 
tation. Indeed, the arguments presented in this chapter 
suggest that man is not the measure of all things, rather 
the arguments constitute a modest plea to bid farewell to 
anthropocentrism. 


Isn't it a pity that we have lost many of those good old
home and kitchen units, such as the thumb, the ell, or
the foot, the knifepoint, the stone, the cloud, the crate,
the walking hour, or horse power? The ‘foot’ is an example
of where man was taken as the measure of all things; in fact
it was in the Middle Ages around 1100 that King Henry I
decreed that his foot would be the unit of length. A nice
illustration of the amusing fact that even only one man
could be the measure of all things, with the clear
disadvantage that that unit undoubtedly changed over time,
and furthermore one may assume that he took this ‘standard
foot’ with him in his grave. Since the era of the
Enlightenment we have been ‘decimalized,’ as the powers of
ten are naturally built into our common number system and
our metric unit system, and they are now by far superior,
not least because they are pretty much shared globally.
However, as you know, there are many remnants that don’t fit
in. I am not just talking of astronomers or atomic
physicists claiming that they only live 10^{-13} parsecs or
10^{13} ångströms respectively from their work, because such
jargon is presumably rather a measure of their ‘professional
deformation,’ or should I say devotion? A brief history of
time may clarify what I mean.


Figure I.3.1: The international prototype of the kilogram. Up
to 2019, this was the standard of mass, kept under three
glass bells in the Bureau International des Poids et Mesures
in Paris. (Source: Wikimedia)


On time 


It’s about time. In spite of the globally accepted metric
supremacy, there remains ample room for exceptions. Think of
our units of time for example. As you know, these are mostly
dictated by the dynamics of our solar system, with the year
that refers to the earth’s revolution around the sun, while
the month is set by the moon’s revolution around the earth
and the day is fixed by the earth’s rotation around its
axis. In fact the system of time divisions was primarily
inspired by the geometry of the circle, which has 360
degrees, approximately one degree per day. The circle
exactly encloses six adjacent equilateral triangles with all
angles equaling 60 degrees, and when you cut this partition
in half (which one can do with only ruler and protractor)
you would account for the division of a year in twelve
months. The solar system’s periodic motions serve as a
celestial clock, with the almost natural choice of 24 hours
to the day. It is better to think of twelve hours for the
day and twelve for the night, which is a division believed
to go back to the Egyptians, who are said to have done their
arithmetic in base 12. From the hour down, the minute and
the second are then counted in the base-60 numbering¹, and
below the second we talk milli- and nanoseconds and we
unanimously convert to base-10 numbering. At the opposite
end of the scale we also think in powers of ten: centuries
and millennia. So, indeed, our common time units are quite
archaic and convoluted.


Unifying the incommensurate. The numbers given to us
by Mother Nature are far from accurate because they may
vary. Moreover, they inhibit implementing the geometric
precision we just alluded to, because there is no physical
reason why the units of year, month and day should have
anything to do with each other, as they refer to entirely
different dynamics which are almost completely decoupled.
And that’s of course why the year is approximately
365.2422... days. To put it in perspective, it is like
decreeing that from now on there are approximately
9.893... cents to the dime and 9.734 dimes to the dollar!
Such incommensurate units would lead to a lot of problems
at the checkout, I am sure!


¹ The base-60 or sexagesimal number system goes back to the
Babylonians as far as about 3100 BC. They later even
introduced a positional notation marking for empty places
(like our zeros) to keep track of additional powers of 60.


To arrive at an orderly bookkeeping of time it took nothing
less than a pope, Gregory XIII to be precise, to decree in
1582, much like a well-trained engineer, that we should make
successive approximations. First we put 365 days in the
year, but to make up for the other decimals we add one day
(let’s pick the 29th of February) every four years, and call
that a leap year. That brings us up to 365.25 days per year
on average. The next step in our approximation is made by
skipping one leap year at the turn of each century, which
reduces the leap-day contribution by a fraction 1/25, so we
arrive at 365.24 days per year on average. In the next step,
we do not skip the leap year once every 400 years, which
gives us a score of 365.2425. The subsequent corrections
are accounted for in a rather ad hoc manner by the
introduction of what are called leap seconds.
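
The successive approximations of the Gregorian recipe translate directly into a few lines of code. This is a minimal sketch with function names of our own; it encodes only the three rules described above and ignores leap seconds and the historical switch-over date.

```python
from fractions import Fraction

def is_leap_year(year):
    """Gregorian rule: every 4th year, except centuries, except every 400th year."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def mean_year_length(cycle_years=400):
    """Exact average number of days per year over one full 400-year cycle."""
    leap_days = sum(is_leap_year(y) for y in range(1, cycle_years + 1))
    return Fraction(365 * cycle_years + leap_days, cycle_years)

# The successive approximations, step by step.
steps = [
    Fraction(365),
    Fraction(365) + Fraction(1, 4),
    Fraction(365) + Fraction(1, 4) - Fraction(1, 100),
    Fraction(365) + Fraction(1, 4) - Fraction(1, 100) + Fraction(1, 400),
]
for s in steps:
    print(float(s))

print("average over 400 years:", float(mean_year_length()))
```

The last approximation and the explicit 400-year average agree exactly, at 146097/400 days per year.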


We see the disparity and feel the tension between nature’s 
innate rhythm and the strictly rational recipes we would like 
to impose. It reminds us of that funny story of a governor 
of a southern state in the US, who thought he could render
his community a great service by decreeing that the number π
from then on would be set equal to 3 in order to simplify
life! But as the number π is defined as the ratio of the
circumference to the diameter of a circle, there
is not much room for decreeing anything about it. With 
the millennium debacle still fresh in our minds, when lots 
of computer software went haywire because of hardwired 
calendar settings which couldn’t handle the number 2000, 
we may have to anticipate future troubles simply because 
the trivial accounting of the Gregorian calendar has not 
been implemented correctly. 


It is amusing to learn that the decimal metric system, which 
goes back to the French Revolution, was also originally 
intended to cover the measurement of time. In 1793 the
French Republican Calendar was introduced, with weeks of 10
days, days of 10 hours, 100 minutes to the hour, and 100
seconds to the minute. This caused
massive protests, not in the least by the church authori- 
ties, who felt they were losing influence and didn’t want 
to reshuffle their Holy days, which were shared anchor 
points for people’s sense of time. It was only in 1805 that 
Napoleon decided to abandon the system. 


The system of time units is, like our DNA, the outcome of 
a contingent sequence of improvements that for the case 
at hand co-evolved with us humans. Our common units of 
time unmistakably reflect the subsequent stages of human 
scientific awareness and technological advancement. 


Reinventing the meter 


An optimal system of units should be complete and
consistent, but also precise. This implies that the most
advanced measurements of the universal constants of nature,
or combinations thereof, have to be used to define units.
The Système International (SI) of units distinguishes 7 base
units and more than twenty derived units. The 7
(independent) base units are: the second (time), the meter
(length), the kilogram (mass), the ampere (current), the
kelvin (temperature), the mole (amount of substance) and the
candela (luminous intensity).


The measurements by which these units have to be de- 
fined should not only be precise, but should also be rela- 
tively easy to reproduce, so as to make it easier to share 
the system of units in a practical way. These criteria are 
ever more relevant, as many of our daily activities depend
on a great precision of measurement that makes our devices
work; think for example of using the Global Positioning
System (GPS). These criteria also make it mandatory
that the system of units has to be upgraded from time to 
time so as to take advantage of the newest scientific and 
technological advances, not unlike the operating systems 
of our computers. 


Let us return to our brief history of time, and see what hap- 
pened to the definition of the second as a unit of time in the 
course of time. We started with time units inspired by the 
heavenly mechanics and the observations thereof. It may 
surprise you, but indeed, up to 1960 the second was defined
as ‘the fraction 1/86400 of the mean solar day.’ The
exact definition of ‘mean solar day’ was left to astronomers. 
Apart from the fact that the rotation of the earth has irreg- 
ularities, the measure itself was ad hoc. In 1967, it was 
finally switched from an astronomical to an atomic time 
standard as it is both far more precise and much easier 
to reproduce. 
When the Saints go marching in...

Given the plain fact that human height is of the order of a
meter, human weight is in the range of kilograms, and the
human heart ticks at the rate of seconds, we should not be
surprised that we have ended up with something nice like the
metric SI (Système International d'unités) or MKS
(meter-kilogram-second) system as the measure of measures.
And with it come the prefixes, the formal powers of ten,
from picoseconds to terabytes and beyond. This metric
thinking suggests that scientists have lifted their
quantitative thinking entirely to the rational norm.
But alas, it is exactly in their ranks that irrational
alternatives flourish. Extensive use is made of derived
units that pay tribute to their ancestors and perhaps (who
knows) one day to themselves. Experts thus actively employ
the Newton, Joule, Pascal, Coulomb, Watt, Farad, Ångström,
Tesla, Gray, Henry, Fermi, Ohm, Siemens, Weber, Hertz,
Oersted, Becquerel, Rydberg, Curie, Fahrenheit, Röntgen,
Stokes, Millikan, Sievert, and whatnot. What’s in a name,
you may wonder. However, note that we should have typeset
these names in lower case, to avoid any suggestion that they
might refer to individuals. After all, the force of 3
Newtons is quite something else than 3 newton. If only we
could have 3 Newtons! It reminds me of the disclaimers made
in the preface of some classic novels: ‘all similarities
with persons alive or dead are purely accidental.’
Count your blessings though: in the nineteenth century, just
to communicate about temperatures, one had to convert
between a rich variety of what I would like to call tribal
scales. Not only the familiar degrees Fahrenheit, Celsius
and Kelvin, but also degrees Réaumur, Rømer, Rankine and
Wedgwood! Fortunately there is only one nature, meaning that
whatever units you happen to invent, they always can be
converted to more sensible ones. So, referring to obscure
units is more a matter of name-dropping highly-esteemed
colleagues than of using double standards.


Today the second is defined as: 


the duration of exactly 9192631770 periods of 
the radiation corresponding to the transition be- 
tween two hyperfine levels of the ground state 
of the Caesium-133 atom. 


This we may write as an exact defining equation:

$$\nu_{\mathrm{Cs}} = 9\,192\,631\,770\ \mathrm{s}^{-1} . \tag{I.3.1}$$


This definition of the second refers to the frequency asso- 
ciated with the radiation that is transmitted if the Caesium 
atom makes the transition between two well-defined quan- 
tized energy (hyperfine) sub-levels. You could say that a 
Caesium clock gives about 9.2 billion ticks a second. That 
quantity can be measured with great precision, meaning 
that if you compare the outcomes of a great many carefully 
performed measurements, the spread of outcomes will be 
extremely small. In other words it is the spread of these
measurements which determines the number of significant
(reliable) digits. By defining the second as a fixed number
times a physical observable, the number of significant dig- 
its in the definition of the unit equals that of the best possi- 
ble measurements. The central point here is that the units 
inherit the precision of the measurements and they there- 
fore necessarily co-evolve with the state of the art in ex- 
perimental physics, without the need to redefine the units 
all the time. 


You may not be surprised to hear that at present physicists
are in the process of developing devices which will allow us
to define the unit of time a factor 100,000 more precisely,
by using so-called femtosecond lasers, which deliver a rapid
succession of tiny light pulses. This technique uses a
so-called frequency comb, produced by a pair of
frequency-locked optical lasers. It is a quantum optical
device for which the Nobel prize was awarded in 2005 to the
American physicist John Hall and his German colleague
Theodor Hänsch. Indeed the definition of time is getting
outdated all the time and a switch to the new quantum
optical standard is to be expected in ten years’ time.


In quantum theory many observable quantities like energy 
levels, currents, fluxes, charges and so on turn out to be 
quantized, meaning that they only can take on discrete val- 
ues, exactly equal to integer multiples of certain combina- 
tions of universal constants. This ‘quantization’ property 
allows them to be measured with extreme precision and 
that makes them particularly suitable for defining units. We
should devise definitions for a set of base units linked to
those universal constants of nature that we can measure
best, and then use those to define the other derived
units.


Also in that vein the unit of length, the meter, was redefined 
in 1983 as: 


the distance traveled by light in vacuum in ex- 
actly 1/299792458 of a second. 


Another way to say it would be to state that, 


c = 299792458 m/s, (1.3.2) 


again exactly, no decimals to be added! This definition to- 
gether with the definition of the second then defines the 
meter. We no longer need to refer to the International
Prototype Meter kept at the Bureau International des Poids
et Mesures in Paris, defined as the distance between two
marks on a platinum-iridium bar that was kept at the
freezing temperature of water.
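

To make the link between the two definitions concrete, here is a minimal sketch in Python. It only uses the two exact numbers quoted above; the function names are illustrative and not part of any standard.

```python
# Exact defining constants of the SI, as quoted in the text.
C = 299_792_458          # speed of light in vacuum, m/s
NU_CS = 9_192_631_770    # caesium-133 hyperfine frequency, Hz


def light_travel_time(distance_m: float) -> float:
    """Time in seconds that light needs to cover distance_m in vacuum."""
    return distance_m / C


def caesium_periods(duration_s: float) -> float:
    """Number of caesium hyperfine periods that fit into duration_s."""
    return duration_s * NU_CS


# One meter expressed as a light travel time, and then in clock ticks.
t_meter = light_travel_time(1.0)
periods_per_meter = caesium_periods(t_meter)

print(f"light crosses 1 m in {t_meter:.4e} s")
print(f"that corresponds to about {periods_per_meter:.2f} caesium periods")
```

In other words, a meter is the distance light travels during roughly thirty ticks of the caesium clock.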


Now, it may come as a surprise to you that the definition
of the kilogram as the unit of mass was up to 2019 linked
to an artefact, the International platinum-iridium kilogram
kept at the aforementioned Bureau in Paris, and shown in
Figure 1.3.1. It comes across as indeed somewhat archaic,
and fortunately this artefact has been replaced by a more
adequate and operational definition involving Planck’s
constant, again referring to precise measurements of quantum
behavior.


The definition of the ampère also used to be somewhat
cumbersome and hard to implement. It was defined as:


the constant current which, if maintained in two 
straight parallel conductors of infinite length, 
of negligible circular cross-section, and placed 
1 meter apart in vacuum, would produce between
these conductors a force equal to $2 \times 10^{-7}$
newton per meter of length.


Imagine entering the store and asking for two infinite wires
of zero cross section: ‘Oh yes, Sir, uh, let me see, oh no,
it’s not in the catalogue. I am really sorry Sir. And by the
way, Sir, may I also ask you to be so kind as to leave my
store immediately please.’


As to the notion of temperature, the definitions were linked 
to phase transitions in matter systems, as for example the 
Celsius degree which was defined as 1/100 of the tem- 
perature difference between the boiling and freezing tem- 
peratures of water under ‘normal’ conditions. Since 1954, 
the kelvin has been defined as exactly equal to the frac- 
tion 1/273.16 of the thermodynamic temperature of the 
triple point of water, which is the point at which water, 
ice and water vapor co-exist in equilibrium. That is a very 
useful definition because for water at a specific pressure, 
the triple point always occurs at exactly a temperature of 
273.16 K. Yet also there it was agreed to couple the defi- 
nition with a universal constant — the Boltzmann constant 
k — which links energy and temperature according to the 
formula E = kT . The new 2019 definition reads:


The kelvin, symbol K, is the SI unit of thermodynamic
temperature; its magnitude is set by fixing the
numerical value of the Boltzmann constant to be
equal to exactly $1.380649 \times 10^{-23}$
J/K [joules per kelvin].


As a matter of fact the most accurate measurements of k 
(about one part in a million) have been obtained by acous- 
tic thermometry, which relies on the fact that the speed 
of sound in a gas is directly dependent on its tempera- 
ture. 
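

As a rough illustration of how k ties temperature to energy and to the speed of sound, the sketch below evaluates kT at the triple point of water and the ideal-gas sound speed $\sqrt{\gamma kT/m}$ for argon. The ideal-gas formula, the choice of argon, and the values of the atomic mass unit and the argon mass are assumptions added here for illustration; they are not taken from the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K (exact by definition)
J_PER_EV = 1.602176634e-19  # joule per electronvolt (exact by definition)
U = 1.66053906660e-27       # atomic mass unit in kg (measured, CODATA 2018)


def thermal_energy(temperature_k: float) -> float:
    """Characteristic thermal energy E = kT in joules."""
    return K_B * temperature_k


def ideal_gas_sound_speed(temperature_k: float, mass_u: float, gamma: float) -> float:
    """Speed of sound sqrt(gamma * k * T / m) for an ideal gas (assumed model)."""
    return math.sqrt(gamma * K_B * temperature_k / (mass_u * U))


T_TRIPLE = 273.16  # triple point of water in K

kT_eV = thermal_energy(T_TRIPLE) / J_PER_EV
# Argon: monatomic, so gamma = 5/3; atomic mass about 39.948 u (assumed).
c_argon = ideal_gas_sound_speed(T_TRIPLE, 39.948, 5.0 / 3.0)

print(f"kT at the triple point: {kT_eV:.4f} eV")
print(f"ideal-gas sound speed in argon: {c_argon:.1f} m/s")
```

Because the sound speed scales as the square root of kT, a precise acoustic measurement at a known temperature translates into a precise value of k.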


Can we change? Yes, we can! What has happened 
over the last half-century is that we have been replacing 
units defined by certain sacred artefacts kept in highly- 
esteemed institutions, with units based on precision mea- 
surements of certain universal constants or combinations 
thereof. 


The diagram depicted in Figure 1.3.2 gives a comprehen- 
sive scheme of the newly proposed definitions of the base 
SI units. The proposal was prepared by the Comité inter- 
national des Poids et Mesures and was officially adopted 
in 2019. This is quite a substantial upgrade, much like the
upgrades of your computer software, except that I would
guess that here we talk about version 26 or so, because
the first versions go back to about 1875. The base units
are represented as colored nodes, and the fundamental 
constants of nature used to define them correspond to 
the surrounding brown nodes. The grey arrows indicate 
how the definitions are hierarchically linked to each other. 
There are seven fundamental units, and therefore seven 
constants are needed to fix them. The proposal is inter- 
esting in that these seven constants are given exact values 
when expressed in the base units, and therefore this guar- 
antees a consistent set of definitions if we follow the arrows 
in the appropriate way. To understand how a unit is defined 
you look at the arrows coming in to the corresponding node 
and see where they come from. One arrow comes from an
adjacent constant of nature and possibly others come from
units that have been defined before.


Figure 1.3.2: New SI units. The update of the definition of the
base SI units adopted by the Comité international des Poids et
Mesures in 2019. The brown nodes represent integers defining
the constants and the arrows indicate the dependencies in the
definitions. You start by defining the second in terms of the
frequency of the ground state Caesium hyperfine transition. Then
you move on to the meter, the kilogram and the ampère, all of
which involve one additional constant, and then you move on to
the kelvin and candela. (Source: Emilio Pisanty, on Wikipedia.)


Let us consider the definition of the ampère A. We start
with the gray arrow coming from the elementary charge
e; that arrow represents the exact value of e in terms of
A:


$e = 1.602176634 \times 10^{-19}$ A s.


The other arrow comes from the ‘second’, which is the unit
we already defined in terms of the caesium frequency, and
therefore A is defined in terms of the observed values of e
and $\nu_{\rm Cs}$.


For the kilogram the prototype is no longer used, but
reference is now made to the exact value of Planck’s constant



$h = 6.62607015 \times 10^{-34}$ kg m$^2$ s$^{-1}$ ,


which now also involves the meter (referring to c and the
second) and the second (referring to $\nu_{\rm Cs}$). So the
definition of the kilogram relies on the measurement of the
constants h, c and $\nu_{\rm Cs}$.
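

The chain of definitions can be followed numerically. The sketch below, which assumes nothing beyond the four exact values quoted in this section, expresses one ampère as a flow of elementary charges per caesium period and one kilogram as a multiple of the mass scale $h\nu_{\rm Cs}/c^2$. The variable names are chosen here for illustration only.

```python
# Exact defining constants of the 2019 SI, as quoted in the text.
NU_CS = 9_192_631_770      # caesium hyperfine frequency, 1/s
C = 299_792_458            # speed of light, m/s
H = 6.62607015e-34         # Planck constant, kg m^2 / s
E = 1.602176634e-19        # elementary charge, A s

# Ampere: 1 A = 1 C/s, i.e. 1/e elementary charges per second.
charges_per_second = 1.0 / E
# Using the caesium clock as the time base: charges per caesium period.
charges_per_cs_period = charges_per_second / NU_CS

# Kilogram: h * nu_Cs / c^2 has the dimension of mass.
mass_unit_kg = H * NU_CS / C**2
kg_in_mass_units = 1.0 / mass_unit_kg

print(f"1 A = {charges_per_second:.4e} elementary charges per second")
print(f"    = {charges_per_cs_period:.4e} elementary charges per Cs period")
print(f"h nu_Cs / c^2 = {mass_unit_kg:.4e} kg")
print(f"1 kg = {kg_in_mass_units:.4e} times h nu_Cs / c^2")
```

The arrows in the diagram correspond to exactly these dependencies: the ampère needs e and the second, the kilogram needs h, the meter (via c) and the second.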


The remaining question is how, and in what combinations,
these constants are determined experimentally. For example
the magnetic flux $\Phi$ that pierces through a two-dimensional
superconductor happens to be quantized directly in terms
of fundamental constants: $\Phi = n\Phi_0 = nh/2e$ , from which
the Josephson constant $K_J = 2e/h$ can be measured
extremely precisely. On the other hand, in a so-called
quantum Hall system, the Hall conductivity, which is a
transverse conductivity, is quantized in units $\sigma_H = ne^2/h$
that allow for a precise determination of the Von Klitzing
constant, $R_K = h/e^2$ . Measuring these two constants yields
an accurate determination of the fundamental constants e
and h. Another important observable defined in terms of
fundamental constants, which can be measured very
precisely, is the fine structure constant $\alpha$ ,

$\alpha = \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{\hbar c}$ . (1.3.3)

Indeed the choice of universal constants forms a fair
reflection of the depth and precision to which science has
managed to descend, and the way they are used in the
definition of SI units strikes the optimal balance between
precision and reproducibility.
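

The statement that the two electrical constants together fix e and h is a matter of simple algebra: from $K_J = 2e/h$ and $R_K = h/e^2$ one finds $e = 2/(K_J R_K)$ and $h = 4/(K_J^2 R_K)$. The following sketch checks this numerically. The value of $\epsilon_0$ is a measured CODATA 2018 number introduced here as an assumption, and the function name is illustrative.

```python
import math

H = 6.62607015e-34        # Planck constant, J s (exact)
E = 1.602176634e-19       # elementary charge, C (exact)
C = 299_792_458           # speed of light, m/s (exact)
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m (measured, CODATA 2018)

K_J = 2.0 * E / H         # Josephson constant, Hz/V
R_K = H / E**2            # Von Klitzing constant, ohm


def e_and_h_from(k_j: float, r_k: float) -> tuple[float, float]:
    """Invert K_J = 2e/h and R_K = h/e^2 for e and h."""
    e = 2.0 / (k_j * r_k)
    h = 4.0 / (k_j**2 * r_k)
    return e, h


e_rec, h_rec = e_and_h_from(K_J, R_K)

# Fine structure constant as in equation (1.3.3), with hbar = h / 2 pi.
hbar = H / (2.0 * math.pi)
alpha = E**2 / (4.0 * math.pi * EPS0 * hbar * C)

print(f"K_J = {K_J:.6e} Hz/V, R_K = {R_K:.3f} ohm")
print(f"recovered e = {e_rec:.9e} C, h = {h_rec:.8e} J s")
print(f"1/alpha = {1.0 / alpha:.3f}")
```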


How universal is universal? 


Universality is a beautiful, ambitious, but also vulnerable
concept, because how do we know whether some constants
of nature are universal or not? In mathematics we
know such numbers exist, but in physics it is harder to
define and establish universality. One should at least
require that such would-be universal numbers are the same
throughout the (our) universe. But how do we know they
are or are not?


Universal constants are — if not God-given — at least Mother- 
Nature-given-numbers. They happen to be equal to what 
they have been found to be in human experiments. Their 
values are believed to be universal, that is, independent of 
space and time. As you know too well, that doesn’t hold for 
all Mother-Nature-given-numbers, like today’s value of your
body-mass index for instance, or the viscosity of some
expensive French Cognac. If I use phrases like ‘the same
everywhere and for all time’, I in fact mean everywhere and
for all time in our universe, or even better, just nearby in
our universe in our present age. Because if we happen to
live in a multiverse — and there is no fundamental reason 
why not — then one of the clues about multiverses is that in 
each separate universe the laws of physics could be quite 
different. They would represent very different points in the 
space of possible theories that we have come up with so 
far. This would imply that there might be entirely different 
sets of universal constants or known constants could take 
different values. 


Fundamental constants as model parameters. A more 
pragmatic approach would be to postulate that the univer- 
sal constants are the numerical input parameters that ap- 
pear in our theories, such as the masses of elementary 
particles and the strengths of the fundamental forces. The 
latter, like Newton’s gravitational constant and the elec- 
tron charge, are also called coupling constants because 
they set the strength of the forces between particles car- 
rying mass and/or charge. The very fact that they ap- 
pear as input parameters means that they cannot be calcu- 
lated within that theory; their value can only be determined 
through experiment. And for all we know these numbers 
are completely independent. 


In mathematics we have universal numbers that are
absolute as they can be rigorously defined. The number $\pi$
for example is defined as the ratio of the circumference 
of a circle and its diameter. It is a dimensionless number 
that cannot change and is absolute within the framework 
of mathematical axioms. One might be tempted to link the 
dimensionless ratios of physical universal constants to an 
expression in terms of the universal numbers of mathemat- 
ics only, much like Plato in his cave would have liked it. In 
spite of the fact that there is quite an industry actively
pursuing these ideas, I consider that somewhat premature. I
can only envisage such a step as a final one where the 
ultimate unified physical theory would be obtained. But 
nobody promised us such a paradise in the first place so 
let’s go back to the parameters in our current fundamental 
physical theories. 


Reducing the number of fundamental constants. From 
the perspective of physics it makes complete sense to ask 
how fundamental these would-be fundamental constants 
really are. Over time, physical theories get more and more 
unified in their description of physical phenomena, imply- 
ing that fewer theories with a smaller number of parame- 
ters suffice to account for the same or an even larger body 
of experimental data. This means that the number of inde- 
pendent fundamental constants has to decrease because 
we discover relations among them. 


Think for example of Maxwell’s theory unifying the
description of electricity, magnetism and light into a single
framework. That theory has in fact three fundamental constants:
(i) the dielectric constant of the vacuum $\epsilon_0$ featuring in
the Coulomb law that gives the force between two electric
charges, (ii) the magnetic permeability of the vacuum $\mu_0$
featuring in Ampère’s law that gives the force between two
current carrying wires, and (iii) the velocity of light c . Now it
turned out that there is a relation between these constants
that follows from Maxwell’s equations; that relation is
just $c = 1/\sqrt{\epsilon_0\mu_0}$ , and it is this relation which
allowed us to write the Maxwell equations (1.1.26), with only
the velocity of light appearing in them. This is a nice
illustration of the fact that the more unified the perspective,
the lower the number of independent fundamental constants.
This insight forces us to accept that our universal constants
are not so universal after all, and it makes us wonder where
this game will end.
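

The relation between the three constants of Maxwell's theory is easy to check numerically. The sketch below uses measured CODATA 2018 values for $\epsilon_0$ and $\mu_0$, which are assumptions brought in here and not quoted in the text, and compares the result with the exact value of c.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m (measured, CODATA 2018)
MU0 = 1.25663706212e-6    # vacuum permeability, N/A^2 (measured, CODATA 2018)
C_EXACT = 299_792_458     # speed of light, m/s (exact by definition)

# Maxwell's relation: c = 1 / sqrt(eps0 * mu0).
c_from_maxwell = 1.0 / math.sqrt(EPS0 * MU0)
relative_difference = abs(c_from_maxwell / C_EXACT - 1.0)

print(f"1/sqrt(eps0 mu0) = {c_from_maxwell:.1f} m/s")
print(f"relative difference from the defined c: {relative_difference:.1e}")
```

Only two of the three numbers are independent, which is the point made above: once c is fixed, measuring either $\epsilon_0$ or $\mu_0$ determines the other.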


Where do we stand? Constants that at present are
considered to be universal are for example the strengths of the
gravitational and electric forces, $G_N$ and $e^2/4\pi\epsilon_0$ , the
velocity of light c , Planck’s constant h, and Boltzmann’s
constant k. These constants are dimensionful; they are not
pure numbers like $\pi$ , because they have some units linked
to them, like c has units length/time. That may disappoint
you because we are talking about universal constants and 
they change already if we go from measuring lengths in 
meters to lengths in inches and the like. 


But the good news is that they, exactly because they have 
units, provide universal — Mother Nature given — links be- 
tween those different types of units. Such links allow you to 
eliminate specific units, for example we can use c to con- 
vert to units where spatial distance is measured in sec- 
onds, light seconds to be precise. A distance of one 
light second is defined as the distance a light pulse would 
travel in one second, so generally the distance d in meters 
corresponds to a distance d/c in light seconds. This 
is what we discussed extensively in the previous section. 
The Sun is eight light minutes away while the Andromeda
galaxy is 2.5 million light years away. Planck’s constant h
appears in the fundamental relation linking energy and frequency
postulated by Einstein, reading $E = h\nu$, and has units
joule × second; the velocity of light links mass and
energy ($E = mc^2$) but also space and time as we saw.
Boltzmann’s constant links temperature to energy through the
relation defining the thermal energy $E \sim NkT$. Having all
these relations we could do away with all conversion
factors, meaning that you can choose units in which the
universal constants (h, c and k) would become equal to unity,
and then measure everything in powers of only joules
(energy) or only meters (length) or only seconds (time). We
will come back to this system of ‘natural units’ shortly.
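

As a small worked example of using c as a conversion factor, the sketch below turns the Earth-Sun distance into light seconds and light minutes. The value of the astronomical unit is an assumption added here (it is the exact value fixed by convention), and the function name is illustrative.

```python
C = 299_792_458            # speed of light, m/s (exact)
AU_M = 149_597_870_700     # astronomical unit in meters (fixed by convention)


def meters_to_light_seconds(distance_m: float) -> float:
    """Convert a distance d in meters to d/c in light seconds."""
    return distance_m / C


sun_light_seconds = meters_to_light_seconds(AU_M)
sun_light_minutes = sun_light_seconds / 60.0

print(f"Earth-Sun distance: {sun_light_seconds:.1f} light seconds")
print(f"                  = {sun_light_minutes:.2f} light minutes")
```

The result, a little over eight light minutes, matches the figure quoted above for the Sun.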


Time dependence of fundamental constants? The com- 
ments made so far suggest that we take a more pragmatic 
stand on this question of universality. On a deeper level the 
value of many would-be universal constants could for ex- 
ample depend on some underlying, hitherto unknown dy- 
namical mechanism, which typically means that they are 
probably not constant in space and time. Instead they are 
like fruit or peanut butter, in that they have an expiration 
date. They turn from external input parameters of the old 
theory into calculable output parameters of the underlying
new theory. They move from the pool ‘fundamental’ to
the pool ‘effective’. But if this is the way it works it
suggests that we should go out and measure whether there
are universal constants that do actually vary in time and
space. We know for example that the fine structure constant
$\alpha = e^2/4\pi\epsilon_0\hbar c$ sets the scale for the separation of lines
in the atomic spectra, and one could try to make observations
of the spectra emitted from atoms that are very, very
far away in the universe and check whether the fine structure
constant was exactly the same or different at the time
the signal was emitted. Experiments of this nature were
proposed by John Barrow et al. in 2002. The results of
such experiments have so far not confirmed the idea but
did produce, in 2008, an upper limit on the relative shift
of $\alpha$ of $10^{-17}$ per year.


The narrow window of opportunity for life. It is the set 
of values that these constants of nature have, which turns 
out to be essential for our universe to be what it is. How do 
we know? Can we go to other universes to check this out? 
No, not quite, but having reliable theories in which these 
numbers feature allows us to ask what would have become 
of our universe if the parameters had had different values. 
The result of such an exercise is quite surprising not to say 
startling: it is only in a very narrow window of parameter 
values that a universe like ours, with its structural complex- 
ity and diversity as expressed through the chemistry of life 
for example, would be possible. We have touched upon
some of these aspects in the section about Big Bang
cosmology. And others will be mentioned in a section on the
ascent of matter in Chapter III.1.


What to do if somebody tells you that they weigh
$10^{52}$ Hertz? If you befriended a music lover and they
tell you that their mass is $10^{52}$ Hertz (1 Hz =
1 inverse second), then you might want to call them
crazy, but if they know about universal constants
what they say may make complete sense. You can
always go back and restore the more familiar units
by multiplying with a particular simple combination
of fundamental constants. In this case you start with
inverse seconds and want to get back to kilograms:
$M = 10^{52}$ [second$^{-1}$] $= M \times h$ [joule] $= M \times
h \times c^{-2}$ [kg] . So, the upshot is that the combination
$hc^{-2}$ converts [sec$^{-1}$] into [kg] . The numerical
factor involved equals $6 \times 10^{-34}/9 \times 10^{16} = 0.66 \times 10^{-50}$
[sec kg]. So having a mass of $10^{52}$ Hz is actually
quite OK. Indeed, units are a matter of convention; if
somebody on a market ordered 50 troy ounces of
Gouda Cheese, you would not be surprised if I told
you that this person was an English jeweler
honeymooning in Amsterdam, would you?
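

The unit conversion in the box about weighing in Hertz can be reproduced in a few lines. This is a minimal sketch using the exact SI values of h and c; the function names are made up for the illustration.

```python
H = 6.62607015e-34     # Planck constant, J s (exact)
C = 299_792_458        # speed of light, m/s (exact)


def hertz_to_kg(frequency_hz: float) -> float:
    """Mass equivalent of a frequency: m = h * f / c^2."""
    return frequency_hz * H / C**2


def kg_to_hertz(mass_kg: float) -> float:
    """Frequency equivalent of a mass: f = m * c^2 / h."""
    return mass_kg * C**2 / H


conversion_factor = H / C**2          # kg per Hz, i.e. [sec kg]
mass_kg = hertz_to_kg(1e52)

print(f"h / c^2 = {conversion_factor:.3e} kg s")
print(f"a mass of 1e52 Hz corresponds to {mass_kg:.1f} kg")
```

With the unrounded constants the factor is about $0.74 \times 10^{-50}$ rather than the $0.66 \times 10^{-50}$ obtained from the rounded numbers in the box, so $10^{52}$ Hz comes out at roughly 74 kg instead of 66 kg. Either way it is a perfectly reasonable body mass.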


Turning the argument around one could say that if we
chose the values of the universal constants at random, the
chance of ending up with a habitable universe would be
vanishingly small. We expect universes equipped with fancy
observers like ourselves to be extremely rare. Lucky us! 
The anthropic principle — a philosophical principle — refers 
exactly to the attempt to apply the arguments just pre- 
sented in the opposite order. It tries to derive the struc- 
ture of our actual universe solely from the fact that we, 
homo sapiens, are here. In a qualitative sense this is of 
course an interesting question, but as a quantitative
approach it strikes me as naive and doomed. Think of the
calculation, from quantum first principles, of the anomalous
magnetic moment of the electron, which agrees with ex- 
periment to twelve significant decimal places! It is hard 
to imagine getting such precision out of a qualitative ap- 
proach like the anthropic principle. To understand the uni- 
verse you need to use far more facts from nature than our 
mere existence. 


Theories outside their comfort zone 


Scientific progress can be measured by how effective our 
theories are. The more physics we explain with the fewer 
theories, the better. In this section we are going to play 
some heuristic games with numbers. The observed nu- 
merical values for our universal constants tell us what the 
relevant scales in nature are. At the same time these
numbers provide insight into the domains of validity of some of
the well-established theories. Surprisingly, naive reasoning
and dimensional analysis lead to suggestive qualitative
insights with respect to fundamental physics. These
arguments underscore the value of heuristics. We have 
listed some of the fundamental scales with the formulas 
related to them in Table 1.3.2 on page 147. 


Domains of validity. Given the values of the universal 
constants, it is enlightening to cook up other numbers from 
them which in turn can be interpreted as characteristic 
scales that play a significant role in our universe. Such 
scales not only follow from the observed values, but also 
from assumptions underlying the theories in which they ap- 
pear as parameters. This number cooking game often in- 
volves extrapolating the ‘laws of nature’ to uncomfortable 
extremes and exactly for that reason this game can yield 
some information on what the domain of validity of such 
theories really is. 


Some devil’s advocate, a malign adversary or even a bright
student may, within the context of a certain model, come up
with some well-defined, yet really nasty questions:
questions which the theory may fail to answer correctly, or
which may cause the theory to get stuck in a recursive loop
that points to a profound confusion or persistent contradiction
in our current understanding. These are contradictions of a
type that faithful teachers sometimes hide, ignore, or even
deny. Yet, there always appears to be a moment of truth when
it is no longer possible to deny that the theory fails to give a
straight answer to a straightforward question, not even in
principle. That is why such Q&A sessions are worth pursuing
in spite of their heuristic if not speculative nature.
Fortunately many of the theorists I met in my life were always
willing, and even eager, to randomly ‘shoot the breeze’
and ask creative ‘what if’ questions.


This freedom to let the collective mind wander should be 
cherished as it is at the heart of scientific progress. And 
scientific progress is basically about pushing the limit on 
the ranges of the validity of theories further and further. 
After each turning point or paradigm shift, the new theory 
usually provides clear-cut quantitative restrictions on the 
domain of validity of the old theory; that is why we can 
speak of scientific progress in the first place².


The virtue of heuristics 


All we need is the back of an envelope. 


Do electrons love or hate each other? We have so
far discussed some aspects of the classical theories and
some of the salient features of the relativity and quantum
domains. And we have commented on the universal
constants of nature that we have measured and that feature as
external input parameters in our models, like the strength
of certain forces and the masses of certain fundamental
particles. Combining such numbers and the simple laws
in which they appear gives interesting information about
the characteristic scales that we observe in nature. That
information is at best qualitative and heuristic, but it does
provide useful insights about the expected domain of
validity of our theories.


Figure 1.3.3: Interacting electrons. Two electrons in outer
space repel because of their equal charges and attract because
of their masses. Yes, and they do fly apart!


² Some devil’s advocates therefore argue that particular religions, as
systems of knowledge, lack an internal mechanism or stimulus through
which they might learn about their limited domain of validity. It is my
opinion that the imperative of open questioning and self-improvement
sets science apart in the history of human endeavors.


In this subsection we limit ourselves to the electromagnetic 
and gravitational force and what we can conclude from 
them with respect to the scales that we should associate 
with them, then we will add some quantum wisdom to it. 
These two forces are remarkable in that they both have an 
infinite range and the laws describing them are so-called
‘inverse square laws’. For the gravitational force between
two masses we have Newton’s law, while for the electric
force between two charges we have Coulomb's law:


$F_G = -G_N\,\frac{m_1 m_2}{r^2}\,, \qquad F_e = \frac{1}{4\pi\epsilon_0}\,\frac{q_1 q_2}{r^2}\,.$ (1.3.4)


One might ask: Is there a way to compare these forces?
Yes and no; they talk about essentially different things like
masses and charges, so it’s like comparing apples and
pears. However, it is not as bad as that because nature
has given us particles that have both mass and charge
(they are both apples and pears so to speak) and these
allow us to compare the strengths of the two forces in a
meaningful way. In Figure 1.3.3 we show two electrons
which have charge e and mass $m_e$ . They attract because
of their masses and they repel because they have equal
charges. If they met in outer space they would experience
two opposite forces, so the key question is: will they pair
up or fly apart? To get the answer, we have to take the
ratio of the magnitudes of the two forces,


$\frac{F_G}{F_e} = \frac{4\pi\epsilon_0 G_N m_1 m_2}{q_1 q_2} \simeq 10^{-43}$ . (1.3.5)


This shows that the gravitational force is phenomenally 
weaker than the electric force. Note that this ratio does not 
depend on the distance; it is a fixed number. How sad for 
the electrons, it is not only hard to stay together; it would 
even be extremely hard to meet in the first place. The or- 
der of magnitude of this number holds for any fundamental 
particle which carries both mass and charge, though the 
actual number could differ of course. 
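

The ratio of the two forces for a pair of electrons can be checked on the back of a digital envelope. The sketch below uses measured CODATA 2018 values for $G_N$, $m_e$ and $\epsilon_0$, which are assumptions added for this illustration, and also confirms that the ratio does not depend on the distance.

```python
import math

G_N = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2 (measured)
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m (measured)
E = 1.602176634e-19        # elementary charge, C (exact)
M_E = 9.1093837015e-31     # electron mass, kg (measured)


def gravitational_force(m1: float, m2: float, r: float) -> float:
    """Magnitude of Newton's force between two masses at distance r."""
    return G_N * m1 * m2 / r**2


def coulomb_force(q1: float, q2: float, r: float) -> float:
    """Magnitude of Coulomb's force between two charges at distance r."""
    return abs(q1 * q2) / (4.0 * math.pi * EPS0 * r**2)


def force_ratio(r: float) -> float:
    """F_G / F_e for two electrons separated by r meters."""
    return gravitational_force(M_E, M_E, r) / coulomb_force(E, E, r)


ratio_near = force_ratio(1e-10)   # roughly an atomic distance
ratio_far = force_ratio(1.0)      # one meter apart

print(f"F_G / F_e at 1e-10 m: {ratio_near:.2e}")
print(f"F_G / F_e at 1 m:     {ratio_far:.2e}")
```

Both distances give the same number, about $2.4 \times 10^{-43}$, because the $1/r^2$ factors cancel in the ratio.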


An electromagnetic size: how big is an electron? Given 
the force between two charges one can calculate the in- 
teraction energy of two charges that are separated by a 
certain distance. One may also define what is called the 
self-interaction energy of a particle due to the force field.
This electrostatic self energy is the energy it costs to build
up a charge e on a sphere of radius r , and is of the order
of $e^2/4\pi\epsilon_0 r$ . Building up a charge means that you bring in
infinitesimal amounts of charge from infinity and calculate
the interaction potential. Equating that potential energy to
its mass energy $m_e c^2$ according to the famous Einstein
formula yields the classical electron radius in terms of its
mass:


$r_e = \frac{e^2}{4\pi\epsilon_0 m_e c^2} \simeq 3 \times 10^{-15}$ m.


This expression is directly obtained from combining certain
constants of nature known from experiments with naive
dimensional analysis, and begs for an interpretation even
though it would be heuristic. It certainly is a size one 
can naturally assign to a charged particle as it reflects the 
energy of the total electric field carried by a charge on a 
sphere of radius re. Note that the electromagnetic size of 
a charge grows with charge but decreases with increas- 
ing mass. So the lightest particle having a certain charge 
yields an upper-bound for the electromagnetic size of such 
a charge. Note the paradoxical nature of this classical
reasoning: it would produce an infinite potential if one
assumed the particles to be point-like. This fact stood out
as a fundamental limitation of the classical theories in the
description of a charged particle, and as we'll see in
Volume III, Chapter III.4, quantum field theory provided an
essential new perspective on this question that exploited the
sophisticated notion of renormalization. Anyway, accord- 
ing to the reasoning we have followed so far, a particle of 
zero charge could still be considered point-like. 
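

The electromagnetic size defined above is straightforward to evaluate. The sketch below computes it for the electron and, for comparison, for a particle with the same charge but the proton's mass; the measured masses and $\epsilon_0$ are CODATA 2018 values added here as assumptions, and the function name is illustrative.

```python
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m (measured)
E = 1.602176634e-19        # elementary charge, C (exact)
C = 299_792_458            # speed of light, m/s (exact)
M_E = 9.1093837015e-31     # electron mass, kg (measured)
M_P = 1.67262192369e-27    # proton mass, kg (measured)


def electromagnetic_radius(charge: float, mass: float) -> float:
    """Radius at which q^2 / (4 pi eps0 r) equals the rest energy m c^2."""
    return charge**2 / (4.0 * math.pi * EPS0 * mass * C**2)


r_electron = electromagnetic_radius(E, M_E)
r_proton_mass = electromagnetic_radius(E, M_P)

print(f"classical electron radius: {r_electron:.3e} m")
print(f"same charge, proton mass:  {r_proton_mass:.3e} m")
```

The heavier particle comes out smaller by the mass ratio of about 1836, in line with the remark that the lightest particle of a given charge sets the upper bound on this size.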


A gravitational size: know your horizons! For a neutral
particle electromagnetic considerations are void, so
one could maybe make use of the gravitational interaction
to set a scale, and assign a classical gravitational radius
to any mass m. One repeats the argument and replaces
Coulomb’s law by Newton’s gravitational law, ignoring for
the moment the sign difference³, so the potential energy of
a mass m at radius r would be $E \sim G_N m^2/r$, and equating
this to the mass energy $E = mc^2$, we get $r_g \sim G_N m/c^2$.
This relation sets a scale for the applicability of classical
Newtonian gravity, and indeed, remarkably enough, it is
(up to a factor 2) equal to the Schwarzschild radius of a
particle of mass m defined as:


$R_s = \frac{2 G_N m}{c^2}\,,$ (1.3.6)


that is $\sim 10^{-57}$ m for the electron. This is an excruciatingly
small number, far outside of the scope where our physical
intuition has any experience, let alone any bearing. It’s
like somebody getting up and starting to talk to you about
what they are planning to get done in the next one billionth
of a second! Stay normal please! The Schwarzschild
radius is where the gravitational horizon around a black hole
with mass m is located, and according to the general
theory of relativity, there is no information that we, as outside
observers, can obtain about the interior of the black hole.
Talking about a particle’s properties beyond that scale is
problematic. If you sent a willing observer to check
out the interior they would not be able to report back to
you, as they are doomed to a not so gracious exit facing
the singularity at the origin.


³ The sign would translate into the statement that the potential
corresponds to the energy needed to gradually bring the mass to infinite
radius.
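

To get a feeling for the numbers, the sketch below evaluates the Schwarzschild radius (1.3.6) for the electron and, for contrast, for the Sun. The measured values of $G_N$, the electron mass and the solar mass are assumptions added for the illustration.

```python
G_N = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2 (measured)
C = 299_792_458            # speed of light, m/s (exact)
M_E = 9.1093837015e-31     # electron mass, kg (measured)
M_SUN = 1.98847e30         # solar mass, kg (measured, approximate)


def schwarzschild_radius(mass_kg: float) -> float:
    """Schwarzschild radius R_s = 2 G_N m / c^2 in meters."""
    return 2.0 * G_N * mass_kg / C**2


rs_electron = schwarzschild_radius(M_E)
rs_sun = schwarzschild_radius(M_SUN)

print(f"Schwarzschild radius of the electron: {rs_electron:.2e} m")
print(f"Schwarzschild radius of the Sun:      {rs_sun / 1000.0:.2f} km")
```

The electron's horizon would sit some 42 orders of magnitude below its electromagnetic size, while the Sun would have to be squeezed into a ball of about 3 km radius.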


No escape: apocalypse you! To clarify this peculiar
property of black holes, it suffices to repeat the thought
experiment that the French mathematician Pierre-Simon, Marquis
de Laplace described in 1796, and that led him to the
notion of the corps obscur, which in modern parlance is just
a black hole⁴. You probably are familiar with the notion of
escape velocity: if you throw this book straight up in the air
it will under most circumstances drop on your head some
time later. Yet, if you throw it with a speed of more than
11 kilometers per second, then it would never return. As
you see it is not so simple to get rid of a book, they tend
to stick around. Far away it would still feel the gravitational
force caused by the mass M of the Earth, but it can
escape because the kinetic energy would be larger than its
gravitational binding energy to the Earth. Equating the
kinetic energy and the binding energy gives the equation for
$v_{\rm esc}$ , we obtain:


$m v_{\rm esc}^2/2 = m G_N M/r \;\Rightarrow\; v_{\rm esc} = \sqrt{2 G_N M/r}$ . (1.3.7)


Note that this velocity does not depend on the mass m 
of the book, so anything you throw up with a velocity ex- 
ceeding 11 km/s will be gone for ever. You see that the 


*A British natural scientist, John Mitchel, had already made a similar 
argument in 1783. He called the objects dark stars. 


THEORIES OUTSIDE THEIR COMFORT ZONE 


131 


Figure 1.3.4: A black hole. General relativity tells us that if we 
put a lot of mass in a tiny volume that mass will collapse under 
its own weight and form a black hole. Black, because its escape 
velocity exceeds the speed of light and — at least classically — 
no information can escape. A virtual sphere called the event 
horizon will form outside of the mass, and its radius corresponds 
to the Schwarzschild radius (1.3.6). 


escape velocity would increase if we would decrease the 
radius of the Earth while keeping its mass fixed. And in- 
deed, knowing that the velocity of light was approximately 
300.000 km/s . Laplace basically asked himself the ques- 
tion: to what radius do we have to shrink the size of the 
Earth in order that the escape velocity would become equal 
to the velocity of light? And beyond that radius, he argued, 
even light would not be able to escape from the Earth’s 
surface — the Earth having the size of a marble by the way. 
No light signals could be sent to some far away observer, 
at least they would not get very far. The Earth would be 
black: a black hole so to speak. Though this tiny Earth 
would be invisible, you would still be able to probe its pres- 
ence gravitationally. If the Sun were a black hole, you 
wouldn’t be able to see it but the planets would move in 
their orbits all the same. Going back to the formula (1.3.7), 
you'll also agree that an object with any given mass M will 


Figure 1.3.5: A black hole picture. This is a real picture of a 
black hole in the galaxy M87 about $5 \times 10^{20}$ km away. It measures
$4 \times 10^{10}$ km across, and has a mass corresponding to
6.5 billion solar masses. The picture was captured by the Event
Horizon Telescope (EHT), a network of eight linked telescopes 
on Earth. 


have an event horizon once its size is small enough. With
black holes one tends to think of very massive objects
like heavy stars. After having burned up their nuclear fuel
they would collapse under their own gravitational attraction
in a supernova event. The compact object staying behind
would indeed be a black hole. Astrophysicists have in the
meantime identified large numbers of them. They are also
located at the centers of galaxies, where one suspects the
presence of giant black holes gobbling up loads of stars for
breakfast. At first nobody could conceive of compressing a chunk
of matter like the Earth to within a radius smaller than its
horizon of less than a centimeter. The concept of a black
hole was so totally inconceivable that it was discarded as
a brilliant fiction of the mind, clearly an artefact of fancy
mathematics. The idea was that a condensed state of matter,
like the space inside a stone or a lead block, would
be ‘filled up’ completely. It would be compressible only
to a limited extent, which seemed evident from just
experimenting with it. The prevailing opinion, held even
by influential astrophysicists like Sir Arthur Eddington,
was that there would be no room for such extreme collapses.
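Laplace's argument can be checked numerically. The sketch below evaluates the escape velocity of equation (1.3.7) for the Earth and the radius at which that velocity would reach the speed of light. The function names and the rounded values for the Earth's mass and radius are illustrative assumptions, not taken from the text.

```python
import math

G_N = 6.67430e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s
M_EARTH = 5.972e24       # mass of the Earth, kg (rounded, assumed)
R_EARTH = 6.371e6        # mean radius of the Earth, m (rounded, assumed)


def escape_velocity(mass_kg: float, radius_m: float) -> float:
    """Escape velocity v_esc = sqrt(2 G_N M / r), equation (1.3.7)."""
    return math.sqrt(2.0 * G_N * mass_kg / radius_m)


def light_trapping_radius(mass_kg: float) -> float:
    """Radius at which v_esc equals c, i.e. r = 2 G_N M / c^2."""
    return 2.0 * G_N * mass_kg / C**2


v_earth = escape_velocity(M_EARTH, R_EARTH)
r_marble = light_trapping_radius(M_EARTH)
print(f"escape velocity from Earth: {v_earth / 1e3:.1f} km/s")
print(f"radius where v_esc = c:     {r_marble * 1e3:.1f} mm")
```

With these inputs the escape velocity comes out at roughly 11 km/s and the critical radius at a little under a centimeter, in line with the marble-sized Earth mentioned above.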


In fact the story went the other way around. With the ad- 
vent of the quantum understanding of the deep structure 
of matter, it was that intuitive idea that matter fills space, 
which turned out to be a fiction of the mind. Quantum the- 
ory taught us exactly the opposite, that matter is mostly 
empty space. The mass of a stone is carried mostly by 
the tiny nuclei inside the atoms and the spacing between 
those nuclei is about a million times larger than their size. 
Removing that space, you could in principle compress the 
Earth to the size of a meter across — the density would then 
correspond to the density of a neutron star. Astrophysicists 
have systematically studied the processes of stellar evolu- 
tion, including their dramatic ending. A star will, depend- 
ing on its mass, end up as a compact object, like a white 
dwarf, a neutron star, or a black hole. Most black holes ob- 
served have masses between five and several tens of solar 
masses, and the lightest known black hole has a mass of 
around 3 solar masses. 


A second comment to make is that small masses also 
have a horizon, which makes it possible to study mini black 
holes in order to find out to what extent they could be pro- 
duced and would be stable. Maybe also these hypotheti- 
cals — fictions of the mind — will be found one day in spite 
of them being ‘invisible.’ 


The age of our universe. Before continuing our black 
hole adventure where science is running into at least one 
of its own horizons — if not a brick wall — we briefly return 
to the other side of General Relativity (GR) connected with 
the Friedmann cosmology, which we discussed quite ex- 
tensively in the previous chapter. The question we want to 
address is a question that has puzzled humankind already 
for millennia, but at the same time it is also a question that 
children start asking when they are in elementary school. 


Did the world always exist, or was there a beginning — a 
moment of creation? And if so, when was that? Such 
questions that everybody encounters at some point in their 
life create a demand for answers, and where there is de- 
mand, economists tell us that there will be supply. And so 
there was! 


There is a long history of estimating the age of the universe.
Well known are the early estimates of a smart clergyman who
managed to argue from the Scriptures that the week of
creation took place around 4000 BC. The
story is that Bishop James Ussher, around 1650, even came
up with a precise date: Sunday 23 October 4004 BC!
What you can say about the history of would-be answers
is that there was an overall trend towards ever increasing
numbers.


It is interesting to recall the involvement of the great biolo- 
gist Charles Darwin who estimated the age of the earth by 
using geological arguments combined with the time needed 
to have the complexity of life evolve, to be a few hundred 
million years. This estimate was heavily criticized by Lord 
Kelvin who argued that the age of the sun, based on the 
state of knowledge — or ignorance — of the day, could not 
be more than say 20 or 30 million years. His knowledge 
typically comprised Newtonian gravity, chemistry and ther- 
modynamics, and his ignorance was hidden in the fact 
that he didn’t know that he didn’t know. The unknown un- 
knowns concerned the whole field of nuclear physics, be- 
cause there was none in those days. And to understand 
the age of the Sun you have to understand the nuclear 
processes that keep the Sun shining. This was an exem- 
plary scientific debate, where Darwin got much closer to 
the correct answer for reasons that are clear now. In the 
second half of the twentieth century the astronomers en- 
tered the game using a variety of observational and cal- 
culational methods. This caused the numbers to go up 
dramatically into billions of years. Fortunately the results 
also started to converge. 




Let us try to make a crude estimate of the age of the universe
starting from the Friedmann equation (1.2.9). For simplicity
we assume that the universe is flat (k = 0) and has
only matter in it. The matter density drops inversely with
the volume, so:


$$\rho_m = \rho_{\rm crit}\, \frac{\Omega_m}{a^3},$$


where $\rho_{\rm crit}$ is the critical energy density and $\Omega_m$ the present
relative matter content (equal to one in this flat, matter-only case).
The equation then simplifies considerably and we get:


$$\sqrt{a}\, \frac{da}{dt} = H_0. \qquad (1.3.8)$$

As you can check by differentiating, the solution is $a = \beta t^{2/3}$,
where $\beta$ is some constant. This yields for the Hubble
parameter $H(t) = 2/(3t)$. Evaluating this for $t_0$ we obtain
$H_0 = 2/(3t_0)$, resulting in the estimate,

$$t_0 = \frac{2}{3H_0} \approx 9.3 \times 10^{9}\ {\rm yr},$$

where we have used the value $H_0 = 70\ {\rm km\,s^{-1}\,Mpc^{-1}}$.
This crude calculation thus shows that the age of the uni- 
verse is of the order of the inverse Hubble parameter. The 
best value available today, extracted from the 2018 data of 
the Planck space telescope is: 


$$t_0 = (13.781 \pm 0.020) \times 10^{9}\ {\rm yr}.$$


Note the amazing precision here, which shows the tremen- 
dous progress in the field of observational cosmology! This 
means that the Hubble parameter is a fundamental ob- 
servable as it sets the scale for the age of the expanding 
universe. 
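The crude matter-only estimate is easy to reproduce. The following sketch converts a Hubble parameter given in km/s/Mpc to inverse seconds and evaluates $t_0 = 2/(3H_0)$. The function name and the rounded conversion factors are assumptions made for illustration.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec (rounded)
SECONDS_PER_YEAR = 3.156e7    # seconds in one year (rounded)


def matter_only_age_years(h0_km_s_mpc: float) -> float:
    """Age t0 = 2 / (3 H0) of a flat, matter-only universe, in years."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC   # H0 expressed in 1/s
    return 2.0 / (3.0 * h0_per_second) / SECONDS_PER_YEAR


age = matter_only_age_years(70.0)
hubble_time = 1.5 * age                        # 1/H0, for comparison
print(f"t0   = {age:.2e} yr")
print(f"1/H0 = {hubble_time:.2e} yr")
```

The inverse Hubble parameter itself is about 14 billion years, which shows why the simple estimate lands within a factor of order one of the measured age.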


Going quantum 


The quantum size of a particle. So far we used the equations
of classical physics and relativity, which involved the
fundamental constants $G_N$, e, and c. What happens if we
include some of the basic quantum relations? This would
add Planck’s constant h (or its reduced version $\hbar = h/2\pi$,
denoted as ‘h-bar’) into our deliberations.
A nice starting point is the expression that Louis de Broglie⁵
proposed in 1923 for the wavelength $\lambda$ of the ‘matter wave’
associated with a particle of mass m moving with velocity
v or momentum p = mv, which simply reads $\lambda = h/mv$.
Combining this formula with Einstein’s dictum that nothing
can move faster than light, implying that v < c, we arrive
at a ‘minimal wavelength’

$$\lambda_c = \frac{h}{mc} \qquad (1.3.9)$$

for a quantum particle, which is called its Compton wavelength.
The Compton wavelength for the electron is $2.43 \times 10^{-12}$ m,
which on a heuristic level can be interpreted as a
measure for the ‘quantum size’ of the electron. For scales
much larger than the Compton wavelength we can safely
consider the electron as a well-defined localized ‘particle’,
whereas when we approach the Compton wavelength we
have to take its wavy nature into account and treat it
quantum mechanically. In other words, also in quantum theory
the notion of a point particle breaks down beyond a certain
scale. A rigorous way to define the Compton wavelength is
to say that it equals, following Einstein, the wavelength
of a photon whose energy equals the rest energy of
a particle: $E = hc/\lambda = mc^2$. This is certainly true but less
straightforward to interpret.
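As a quick numerical check of equation (1.3.9), the sketch below computes the electron's Compton wavelength and verifies that a photon of that wavelength carries exactly the electron's rest energy. The helper names are illustrative. The constants are standard SI values.

```python
H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
M_E = 9.1093837e-31      # electron mass, kg


def compton_wavelength(mass_kg: float) -> float:
    """Compton wavelength lambda_c = h / (m c), equation (1.3.9)."""
    return H / (mass_kg * C)


def photon_energy(wavelength_m: float) -> float:
    """Photon energy E = h c / lambda."""
    return H * C / wavelength_m


lam_e = compton_wavelength(M_E)
rest_energy = M_E * C**2
print(f"electron Compton wavelength: {lam_e:.3e} m")
print(f"photon energy / rest energy: {photon_energy(lam_e) / rest_energy:.6f}")
```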


Alternatively we may invoke Heisenberg’s uncertainty relation
$\Delta x\, \Delta p \geq \hbar/2$, which in words amounts to the statement
that in a given quantum state of a particle the uncertainty
in the outcome of a position measurement times the
uncertainty in the outcome of a momentum measurement
equals at least h-bar over two. If we then interpret
mc as the maximum uncertainty in momentum, this leads
to the Compton wavelength as the minimal uncertainty in


⁵In fact I should have said: Louis-Victor-Pierre-Raymond, the 7th
Duke of Broglie!
Figure 1.3.6: The Rutherford model of the atom. The canonical
picture of the atom proposed by Ernest Rutherford, with a
positively charged nucleus consisting of protons and neutrons, 
with some negatively charged electrons orbiting the nucleus. It 
is a symbolic representation, which is misleading in two ways. 
The relative sizes are totally out of proportion, since the size 
of the orbits is about 100,000 times larger than the size of the 
nucleus. So if you take the nucleus as depicted in this figure 
the electron orbits would be about a kilometer in size! Further- 
more, in the stationary states of the atom, the electrons are not 
at all localized like point particles. The states rather correspond 
to the ‘standing’ wave patterns proposed by Bohr as indicated 
in the next figure. They represent the smeared out probability 
distributions for finding the electron at a given location. 


position of the particle. As we will see later this scale is di- 
rectly linked to the width of the ‘wave packet’ representing 
the electron in quantum theory. And with a quantum leap 
in vagueness you could argue that minimal uncertainty in 
position indicates the effective size of the quantum equiv- 
alent of a particle. 


Heuristics and reasoning by analogy make for a dangerous game,
but they can be enlightening: with little work they yield a
rough sense of the scales involved, though no more than that.
Therefore they are very useful!


Figure 1.3.7: The Bohr atom. This model has quantized orbits,
satisfying the constraint that the electron wave would fit an
integer number of times on the orbit. This condition, $n\lambda = 2\pi r_n$,
leads to states with quantized energy and angular momentum.
The radii of the successive orbits scale quadratically ($\sim n^2$).


An atomic size: the Bohr radius. There is one more 
quantum scale that we should mention at this point. It is 
the first quantum estimate of the atomic size called the 
Bohr radius. In 1911 Rutherford had shown that the atom 
has an almost point-like positively charged nucleus with 
the electrons orbiting around it. This brought Niels Bohr to 
his famous atomic model that, with its simple but radical
starting point, immediately led to an astonishingly deep
insight into the line structure of atomic spectra. Indeed, it is
one of the most outstanding results of early quantum theory.
The argument used the wave character of the electron
(say De Broglie’s formula) to quantize the atomic orbits, and
thereby also their energy levels. From these energies the
frequencies of the lines in the spectra could be calculated
directly.


Bohr used the idea of particle-wave duality, and put it into
practice by assuming that the stationary electron states in
the tiny atomic environment would correspond to standing
waves on a supposedly classical orbit. That wave should
not destructively interfere with itself and therefore Bohr
demanded that the electron wave would fit an integer number
of times on the orbit of the electron, which led him to the
‘quantization condition’: $n\lambda = 2\pi r$ with n = 1, 2, 3, ....
If you now use the relation of De Broglie between momentum
and wavelength you get $p = h/\lambda = nh/(2\pi r) = n\hbar/r$.
A straightforward exercise in Newtonian mechanics
shows that in order to have a circular orbit you need a central
force $F_c = ma = mv^2/r$, which in this case is provided
by the Coulomb force $F_e$ of equation (1.3.4). So, from the
equation $F_c = F_e$ one finds the possible radii $r_n$:


$$\frac{e^2}{4\pi\epsilon_0 r^2} = \frac{mv^2}{r} = \frac{p^2}{mr} = \frac{n^2\hbar^2}{mr^3},$$


from which Bohr derived the quantization rule:


$$r_n = a_0 n^2, \quad {\rm with} \quad a_0 = \frac{4\pi\epsilon_0\hbar^2}{me^2} \approx 5.3 \times 10^{-11}\ {\rm m},$$


where the constant $a_0$ is called the Bohr radius in honor of
its creator. In Figure 1.3.7 we have sketched the periodic
electron waves for the first few orbits.


It is no surprise that the quantization of the orbits implies
that other physical quantities are also quantized, notably
the energies and the angular momentum. To start with the
latter, for a circular motion the angular momentum is
L = rp, and just substituting the quantized value
for p given above, one gets $L = n\hbar$, showing the basic
integer quantization condition for orbital angular momentum;
$\hbar$ indeed has the dimensions [kg m²/s] of angular
momentum. Substituting the radius in the expression
for the total energy $E = E_{\rm kin} + E_{\rm pot} = p^2/2m + V_{\rm Coul}$,
one finds that the energy is quantized as


$$E_n = E_1/n^2, \qquad (1.3.10)$$


where the ground state energy is given by:


$$E_1 = -\frac{me^4}{32\pi^2\epsilon_0^2\hbar^2} \approx -13.6\ {\rm eV}. \qquad (1.3.11)$$


We see that the energies of the hydrogen atom are negative
(meaning that they are bound states) and that for
large n the states pile up towards E = 0. An essential
feature of the model, also depicted in Figure 1.3.7,
is the proposition that when an electron makes a transition
from a higher to a lower orbit, the energy difference
$\Delta E$ will be carried away by a photon whose frequency
$\nu$ satisfies $h\nu = \Delta E$. We return to the Bohr model and its
relationship with the observed atomic line spectra in the section
on atomic structure in the next chapter.
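The Bohr relations above can be verified with a few lines of code. This sketch computes the Bohr radius, checks that the quantized orbit indeed balances the Coulomb and centripetal forces, and evaluates the energy levels and the photon frequency for the transition from n = 2 to n = 1. All function names are illustrative assumptions. The constants are standard SI values.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
H = 2.0 * math.pi * HBAR     # Planck constant, J s
M_E = 9.1093837e-31          # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

# Bohr radius a0 = 4 pi eps0 hbar^2 / (m e^2)
A0 = 4.0 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)
# Ground state energy E1 = -m e^4 / (32 pi^2 eps0^2 hbar^2), in joules
E1 = -M_E * E_CHARGE**4 / (32.0 * math.pi**2 * EPS0**2 * HBAR**2)


def orbit_radius(n: int) -> float:
    """Radius of the n-th Bohr orbit, r_n = a0 n^2."""
    return A0 * n**2


def energy_ev(n: int) -> float:
    """Energy of the n-th level in eV, E_n = E1 / n^2 (equation 1.3.10)."""
    return E1 / n**2 / E_CHARGE


def force_mismatch(n: int) -> float:
    """Relative difference between Coulomb and centripetal force on orbit n."""
    r = orbit_radius(n)
    p = n * HBAR / r                                  # quantized momentum
    coulomb = E_CHARGE**2 / (4.0 * math.pi * EPS0 * r**2)
    centripetal = p**2 / (M_E * r)
    return abs(coulomb - centripetal) / coulomb


def photon_frequency(n_high: int, n_low: int) -> float:
    """Frequency nu = Delta E / h of the photon emitted in n_high -> n_low."""
    delta_e = (energy_ev(n_high) - energy_ev(n_low)) * E_CHARGE
    return delta_e / H


print(f"a0 = {A0:.3e} m, E1 = {energy_ev(1):.2f} eV")
print(f"nu(2 -> 1) = {photon_frequency(2, 1):.3e} Hz")
```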


Further gaming with fundamental scales. Returning to
typical length scales related to the electron, we have so
far cooked up three sizes: (i) the classical electromagnetic
size (= the classical electron radius) $r_e \approx 10^{-15}$ m, (ii)
the gravitational radius (= the Schwarzschild radius)
$R_s \approx 10^{-57}$ m, and (iii) the quantum scale (= its Compton
wavelength) $\lambda_c \approx 10^{-12}$ m. One thing these numbers clearly
suggest is that in worrying about the size of the electron
we should first take into account quantum effects, before
entering into profound debates on the meaning of its
classical electromagnetic or gravitational radii.


What else can we do with these length scales? We could
take their ratios and try to interpret them. We can for example
define the dimensionless ratio of (i) and (iii). This
number (up to the factor $h/\hbar = 2\pi$) is denoted as $\alpha$ and
called the fine structure constant, $\alpha = e^2/4\pi\epsilon_0\hbar c$; it is
indeed a pure number and equals $\alpha \approx 1/137$. This constant
is a clean measure of the interaction strength of the
electromagnetic interaction in (relativistic) quantum theory,
which is not so surprising because the fundamental constants
e, c and $\hbar$ feature in it.


Another dimensionless ratio one can take is the Compton
wavelength over the Bohr radius, giving an idea as to
what extent the electron would fit in the atom. One finds
that $\lambda_c/a_0 = 2\pi\alpha$, which is again proportional to the fine
structure constant. This indicates that the two scales are
not vastly different, particularly if one takes into account
that the electron in the Bohr atom is non-relativistic and
the Compton wavelength is an underestimate of its quantum
size. This underscores once more that we should treat the
problem of atomic structure with quantum theory.


We could also define a gravitational fine structure constant
as $\alpha_g = G_N m_e^2/\hbar c$, which would equal $\alpha_g \approx 1.75 \times 10^{-45}$.
The ratio of these two ‘structure constants’,
which (up to a factor of two) also equals the ratio of (ii) and (i),
brings us back to the intrinsic difference in coupling strength
that we mentioned before: the gravitational attraction of two electrons
is weaker than their electromagnetic repulsion by some 43
orders of magnitude.
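The dimensionless ratios discussed above follow directly from the fundamental constants. The sketch below computes the three electron length scales, the fine structure constant and its gravitational analogue, and checks the stated relations between them. Variable names are illustrative. The constants are standard SI values.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
H = 2.0 * math.pi * HBAR     # Planck constant, J s
C = 2.99792458e8             # speed of light, m/s
G_N = 6.67430e-11            # Newton's constant, m^3 kg^-1 s^-2
M_E = 9.1093837e-31          # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

coulomb = E_CHARGE**2 / (4.0 * math.pi * EPS0)    # e^2 / (4 pi eps0), J m

r_classical = coulomb / (M_E * C**2)              # (i) classical electron radius
r_schwarzschild = 2.0 * G_N * M_E / C**2          # (ii) Schwarzschild radius
lambda_compton = H / (M_E * C)                    # (iii) Compton wavelength
bohr_radius = HBAR**2 / (M_E * coulomb)           # a0

alpha = coulomb / (HBAR * C)                      # fine structure constant
alpha_g = G_N * M_E**2 / (HBAR * C)               # gravitational analogue

print(f"1/alpha       = {1.0 / alpha:.3f}")
print(f"alpha_g       = {alpha_g:.3e}")
print(f"alpha_g/alpha = {alpha_g / alpha:.2e}")
```

The last ratio comes out at a few times $10^{-43}$, which is the origin of the "43 orders of magnitude" quoted in the text.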


A quantum bound on processing speed. The uncertainty
relations allow one to construct heuristic quantum
bounds on various dual observables like momentum and
spatial extent, or energy and time. The former yielded the
Compton length, and the latter allows us for example to
set an ultimate bound on processing speed. If we take the
energy corresponding to a mass, $E = mc^2$, relate that
energy to a fundamental frequency according to $E = h\nu$,
and interpret this frequency as the number of logical
operations per second, we arrive at the formula proposed by
Seth Lloyd in a 2000 paper for the maximal number of
transitions $N^*$ per unit mass per unit time:


$$N^* \simeq \nu = c^2/h. \qquad (1.3.12)$$


Putting in the numbers one arrives at the ultimate processing
speed of a ‘one kilogram laptop’ as some $10^{50}$ logical
operations per second. To give you an idea of what this
means typical estimates for the human brain yield 10), 
while the most powerful super computers run at 10—100 x 
10! flops. These comparisons are rather misleading be- 
cause of the very different structure of these ‘machines;’ 
the brain has a relatively low clock speed of about 100Hz 
but works in a highly parallel mode. 
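The heuristic bound of equation (1.3.12) is a one-line computation. The sketch below evaluates $mc^2/h$ for a given mass. The function name is an assumption, and the result is meant only as the order-of-magnitude estimate described in the text.

```python
H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s


def max_operations_per_second(mass_kg: float) -> float:
    """Heuristic bound nu = m c^2 / h on logical operations per second."""
    return mass_kg * C**2 / H


laptop = max_operations_per_second(1.0)
print(f"one kilogram laptop: {laptop:.2e} operations per second")
```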


Nuclear forces: the story of weak and strong. Later on
in the book we will discuss two other forces which are not
of the inverse square type: the strong and weak nuclear
forces. They differ in an essential way in that they effectively
only act over small distances (meaning small compared
to the size of an atom), and that is why we don’t see
or feel them. These forces can be approximated by an
inverse square law which is cut off at a certain characteristic
scale, called the strong and weak scales respectively. The
effective potential of a weak or strong charge corresponds
to the so-called Yukawa potential:


$$V_Y = g_Y\, \frac{e^{-r/\lambda_Y}}{r}. \qquad (1.3.13)$$


We see that the 1/r potential standard for gravity
and electromagnetism, with a strength $g_Y$, is multiplied
by a decaying exponential of the distance. The interaction
potential is said to be screened and becomes vanishingly
small past the typical scale $\lambda_Y$, which appears as an
additional fundamental parameter in the theory. Such screening
effects make the interactions effectively short range.
The particles’ experience would be comparable to driving
in a dense mist or calling each other in a crowd: the
interaction between entities is only effective at short
distances. We have depicted the Yukawa potential in
Figure 1.3.8. The interpretation of the characteristic scale $\lambda_Y$
is that it is inversely proportional to the mass $m_Y$ of the
particle that mediates the nuclear force, like the photon
mediates the electromagnetic force. The natural relation
between a mass and a characteristic length is the quantum
scale or Compton wavelength of the particle, as pointed out
before.


To complete this short interlude on the nuclear forces, let
us just give you the scales involved. The strong nuclear
scale is in the first approximation associated with the
exchange of so-called pion particles. Their masses are of
order $m_\pi \sim m_p$, yielding a length scale of approximately
$10^{-14}$ cm, which typically is the size of a nucleus.
For the weak nuclear force the mediating particles are the
W and Z bosons with masses $M_W \sim 100\, m_p$, and
consequently the weak force has a tiny range of the order of
$\lambda_W \approx 10^{-16}$ cm.


[Plot legend: f(x) = exp(−x); f(x) = −exp(−x)/x; f(x) = −1/x]

Figure 1.3.8: The Yukawa potential. The purple curve is the
product of a (−1/x) potential in blue (like the one of
electromagnetism or gravity) and an exponential suppression factor
(exp −x) in red. What results is an attractive potential for the
weak and strong interactions, which is effectively short-range
because of the exponential cut-off.


At this point we may also mention that setting the mass 
of the mediating particle to zero we get back to the for- 
mulas of electromagnetism and gravity (the blue curve in 
Figure 1.3.8). This confirms our earlier claim that these 
long range force fields are associated with the exchange 
of massless particles such as the photon and the gravi- 
ton. In that case the potential is a simple power law re- 
flecting the scale free nature of the long range interaction. 
The power law potential therefore exhibits what in modern
parlance is called a long or fat tail, which refers to the
behavior clearly visible on the right in Figure 1.3.8 where the
blue power law curve is much larger than the exponentially
suppressed purple and red curves.
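To make the screening effect concrete, the following sketch assumes a Yukawa potential of the form $g\, e^{-r/\lambda}/r$ with the range $\lambda$ set by the reduced Compton wavelength $\hbar/mc$ of the mediating particle. It uses the order-of-magnitude mediator masses quoted in the text (one proton mass and one hundred proton masses). The function names are illustrative.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
C = 2.99792458e8             # speed of light, m/s
M_PROTON = 1.67262192e-27    # proton mass, kg


def yukawa(r: float, strength: float, scale: float) -> float:
    """Screened potential g * exp(-r / lambda) / r (assumed form)."""
    return strength * math.exp(-r / scale) / r


def force_range(mediator_mass_kg: float) -> float:
    """Range lambda = hbar / (m c) set by the mediator mass."""
    return HBAR / (mediator_mass_kg * C)


strong_range = force_range(M_PROTON)          # mediator mass ~ m_p
weak_range = force_range(100.0 * M_PROTON)    # mediator mass ~ 100 m_p

# Suppression of the Yukawa potential relative to a pure 1/r potential
# at a distance of five times the characteristic scale.
r = 5.0 * strong_range
suppression = yukawa(r, 1.0, strong_range) / (1.0 / r)

print(f"strong range ~ {strong_range * 100:.1e} cm")
print(f"weak range   ~ {weak_range * 100:.1e} cm")
print(f"suppression at r = 5 lambda: {suppression:.4f}")
```

Letting the mediator mass go to zero sends the range to infinity and the exponential factor to one, which recovers the unscreened 1/r potential.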


Where the quantum collective rears its head. We started
in Chapter 1.1 by summarizing the fundamental theories of
classical physics, and we have indicated in this chapter
how quantum theory enters to indicate the boundaries of
the domain of validity of the classical theories of mechanics,
gravity and electromagnetism. It will not surprise you
that the theory of statistical or thermal physics also has
an intrinsic parameter that tells you when quantum
phenomena should be expected to become relevant in
multi-particle systems such as gases and liquids. The
interesting thing here is that these phenomena even occur in
‘ideal’ systems where we ignore interparticle interactions.
The central observation is again based on a simple
dimensional argument. If one considers an ideal gas of massive
atoms in equilibrium at some temperature T, then the
average thermal energy per particle is $E_{\rm th} = 3kT/2$. So
we can define a thermal momentum $p_{\rm th}$ through the relation:


$$\frac{p_{\rm th}^2}{2m} = \frac{3kT}{2}. \qquad (1.3.14)$$


Next we use the De Broglie relation to define a thermal
wavelength as $\lambda_{\rm th} = h/p_{\rm th} = h/\sqrt{3mkT}$. This length
scale depends on h and defines the size of the wave packets
related to the thermal excitations of the particles in a
gas. For the case of particles in a gas at room temperature
the thermal wavelength is typically of the order 0.1
Ångström or $10^{-11}$ meters. When does this scale become
relevant? It clearly matters if it becomes of the order of or
larger than the typical interparticle distance d, which is
determined by the particle number density n = N/V
through $d = n^{-1/3}$. Classical considerations (even in the absence
of interactions) should break down if:


$$\lambda_{\rm th} \gtrsim d \quad\Rightarrow\quad \frac{h\, n^{1/3}}{\sqrt{3mkT}} \gtrsim 1. \qquad (1.3.15)$$


The conclusion is that we enter the quantum domain at
high density and/or low temperature. As we will discuss
later on, this is intimately linked to spectacular quantum
phenomena like superfluidity, (super-)conductivity, and
Bose-Einstein condensation. A rough indication of some
examples where the quantum laws are inescapable can be
found in table 1.3.1.
Table 1.3.1: Thermal wavelengths and domains*


System                    | T [K] | $\lambda_{\rm th}/d$ | domain
Air at room temperature   | 300   | 0.006  | classical
Liquid nitrogen (N₂)      | 77    | 0.10   | classical
Liquid helium (⁴He)       | 4     | 1.16   | quantum
Electrons in copper       | 300   | 18.9   | quantum


* R. Baierlein, Thermal Physics, Cambridge Un. Press (1999). 
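The criterion (1.3.15) can be evaluated for concrete systems. The sketch below uses the thermal wavelength $h/\sqrt{3mkT}$ and compares it with the interparticle distance $n^{-1/3}$ for air at room temperature and for liquid helium. The helium density of 125 kg/m³ and the function names are assumptions made for illustration. The numerical prefactor in the definition of the thermal wavelength is a matter of convention, so the ratios need not coincide exactly with those in the table.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053907e-27     # atomic mass unit, kg


def thermal_wavelength(mass_kg: float, temperature_k: float) -> float:
    """Thermal wavelength lambda_th = h / sqrt(3 m k T)."""
    return H / math.sqrt(3.0 * mass_kg * K_B * temperature_k)


def degeneracy_ratio(mass_kg: float, temperature_k: float,
                     number_density: float) -> float:
    """Ratio lambda_th / d with interparticle distance d = n^(-1/3)."""
    return thermal_wavelength(mass_kg, temperature_k) * number_density ** (1.0 / 3.0)


# Air (mostly N2, 28 u) treated as an ideal gas at 300 K and 1 atm.
n_air = 101325.0 / (K_B * 300.0)
ratio_air = degeneracy_ratio(28.0 * AMU, 300.0, n_air)

# Liquid helium-4 at 4 K, assuming a mass density of 125 kg/m^3.
n_helium = 125.0 / (4.0 * AMU)
ratio_helium = degeneracy_ratio(4.0 * AMU, 4.0, n_helium)

print(f"air at 300 K:       lambda_th/d = {ratio_air:.3f}")
print(f"liquid 4He at 4 K:  lambda_th/d = {ratio_helium:.2f}")
```

Air comes out far below one (classical), while liquid helium comes out of order one (quantum), matching the classification in the table.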


Natural units ©1898 Max Planck 


We conclude by discussing a system of natural units in- 
troduced by Max Planck. The price to pay is to give up 
anthropocentricity, at least on the level of units. 


Can we please finish this endless talk about units? Yes, we
can! Max Planck already pointed out how to do
this in the Fünfte Mitteilung, Über irreversible
Strahlungsvorgänge, to the Preussische Akademie in 1898⁶:


Alle bisher in Gebrauch genommenen physikalischen
Maßsysteme, auch das sogenannte absolute C.G.S.-System,
verdanken ihren Ursprung insofern dem
Zusammentreffen zufälliger Umstände, als die Wahl
der jedem System zu Grunde liegenden Einheiten
nicht nach allgemeinen, notwendig für alle Orte und
Zeiten bedeutungsvollen Gesichtspunkten, sondern
wesentlich mit Rücksicht auf die speziellen Bedürfnisse
unserer irdischen Kultur getroffen ist.


In the Mitteilung he devised a system of units that de- 
serves the qualification natural like no other. These Planck- 


⁶English translation (by author): All physical systems of measurement,
including the so-called absolute CGS system, which have hitherto
been used, owe their existence to accidental circumstances, in that
the choice of the units on which each system is based does not depend
on general points of view that necessarily hold for all places and times,
but takes into consideration only the special needs of our earthly culture.


units are all directly linked to the simple universal con- 
stants that we discussed before: 

– the gravitational constant $G_N \sim$ [kg⁻¹ m³ s⁻²],
– the speed of light $c \sim$ [m s⁻¹],

– Planck’s constant⁷ $\hbar \sim$ [kg m² s⁻¹],

– and Boltzmann’s constant $k \sim$ [kg m² s⁻² K⁻¹].


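Some juggling with dimensions leads quite unambiguously
to the following natural units: the Planck unit of length or
the Planck length,


$$\ell_P = \sqrt{\frac{\hbar G_N}{c^3}} = 1.62 \times 10^{-33}\ {\rm cm},$$


the Planck mass,


$$m_P = \sqrt{\frac{\hbar c}{G_N}} = 2.18 \times 10^{-5}\ {\rm g},$$


and the Planck time,


$$t_P = \sqrt{\frac{\hbar G_N}{c^5}} = 5.39 \times 10^{-44}\ {\rm s}.$$


If we include the Boltzmann constant k as another fundamental
constant, we may add the Planck unit of temperature:


$$T_P = \sqrt{\frac{\hbar c^5}{G_N k^2}} = 1.42 \times 10^{32}\ {\rm K}.$$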

Divine units indeed! Imagine: adopting these as the units
of length, mass, time and temperature amounts to setting
all the above expressions in terms of the fundamental
constants equal to one, which implies that we have
to set $G_N = \hbar = c = k = 1$ in all formulas and calculations!
What a relief for the students who have to remember them.
I am afraid though that in the real world of construction
and electrical engineers these units would be despised


⁷In Planck’s original paper this (or better, his) constant was actually
called b and not h.
except in the rare instance where one is involved in building
universes⁸. This is precisely because using
these divine units would involve such huge conversion factors
that you would lose a common sense of scale. On
the other hand, dragging along all these fundamental constants
all the time makes formulas far less transparent and
clutters the mind. I challenge the entrepreneurial readers
to choose natural units for the rest of this chapter,
which means that you set the universal constants everywhere
equal to one. You will find that the resulting formulas
become stunningly simple indeed.


Ahead of the crowd. The natural units beg for an interpre- 
tation and maybe it is just that they mark the domain of va- 
lidity of the theories of Einstein and/or quantum theory. Or 
better, they mark a domain where quantum and relativity 
typically meet. Problems at the Planck scale involve phe- 
nomena where quantum gravitational effects have to be 
included. And if we do not have a complete understanding 
of what the quantum theory of gravity is, our calculations 
will be unreliable to say the least, and may give unsatisfactory
answers to sensible questions. Referring back to
the scales we discussed before, we see that for a fundamental
particle with a mass equal to $m_P$, the Compton
wavelength and the Schwarzschild radius become roughly
equal since:


$$\hbar/m_P c = G_N m_P/c^2.$$


This expresses the fact that for a particle with a mass of
the order of the Planck mass the quantum uncertainty in
its spatial extent is the same as its ‘gravitational’
uncertainty. This gravitational uncertainty is due to the strong
gravitational field, which makes it impossible to extract,
from outside that tiny horizon, information about
what happens inside. The equals sign in the above equation,
inspired by matching uncertainties, basically expresses
the bold hypothesis that both uncertainties are somehow


⁸Surprisingly, quite a few engineers appear to do so in their spare
time. I would rather have engineers constructing universes, than
philosophers building airplanes!


due to the same underlying mechanism. Such a mecha- 
nism would have to be accounted for by a would-be theory 
of quantum gravity. 


It is worth remembering that heavy objects have a Compton
wavelength that is negligible. For the Earth, for example,
we get $\lambda_\oplus = h/m_\oplus c \approx 4 \times 10^{-65}$ cm, while its
Schwarzschild radius is still a respectable $R_\oplus = 0.9$ cm. And
because both are so much smaller than the actual size of the
Earth, it is not in terrestrial physics that this fundamental
contradiction will leave any mark. For the electron
the situation is the opposite: $\lambda_e = h/m_e c \approx 10^{-12}$ m
(indeed its non-local character manifests itself on the atomic
scale), while the Schwarzschild radius is an excruciatingly
small $R_e = 2G_N m_e/c^2 \approx 10^{-57}$ m. The conceptual conflict
between relativity and quantum theory as encountered at
the Planck scale signals the crisis our notion of space-time
suffers in the light of the quantum postulates. On the other
hand, one might hope that this crisis too will be the seed
for a new fundamental paradigm.
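The competition between the two length scales can be tabulated directly. The sketch below compares the reduced Compton wavelength $\hbar/mc$ with the Schwarzschild radius $2G_N m/c^2$ for the electron, the Planck mass and the Earth, and checks that at the Planck mass $\hbar/mc$ equals $G_N m/c^2$. The function names and the rounded Earth mass are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
G_N = 6.67430e-11        # Newton's constant, m^3 kg^-1 s^-2
M_E = 9.1093837e-31      # electron mass, kg
M_EARTH = 5.972e24       # mass of the Earth, kg (rounded, assumed)
M_PLANCK = math.sqrt(HBAR * C / G_N)   # Planck mass, kg


def reduced_compton(mass_kg: float) -> float:
    """Reduced Compton wavelength hbar / (m c)."""
    return HBAR / (mass_kg * C)


def schwarzschild_radius(mass_kg: float) -> float:
    """Schwarzschild radius 2 G_N m / c^2."""
    return 2.0 * G_N * mass_kg / C**2


for name, mass in [("electron", M_E), ("Planck mass", M_PLANCK), ("Earth", M_EARTH)]:
    print(f"{name:12s} Compton: {reduced_compton(mass):.2e} m   "
          f"Schwarzschild: {schwarzschild_radius(mass):.2e} m")
```

Below the Planck mass the quantum scale dominates, above it the gravitational one does, and the two cross at the Planck length.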


Black holes 


The question is not whether black holes ‘exist’. They 
exist as classical objects. The question is, what is 
the quantum mechanical equivalent? It is well pos- 
sible that in quantum mechanics black holes are 
no longer strictly distinguishable from more con- 
ventional forms of matter. 

Gerard ’t Hooft, Physica Lecture (1995) 


It is widely believed that black holes are rather esoteric,
far-fetched, out of this world, nerdy gadgets and therefore not
so relevant. Wrong! It has become ever more clear that
they are the principal key to a new and much deeper
understanding of what gravity and thus space-time are really
about. They have introduced the concept of information into
physics in a fundamental way. Indeed, some people say that
Figure 1.3.9: Information loss? Does information get lost forever
when it falls into a black hole? Detail of the remarkable
sculpture Le Nomade in Antibes (France) created by the Spanish
sculptor Jaume Plensa.


‘black holes’ are for the theory of gravity, what the ‘hydro- 
gen atom’ was for quantum theory. Gravity's fundamental 
properties and problems really show up in the unexpected 
intricacies of black hole geometry. There is more to grav- 
ity than dropping a teaspoon on the floor, or keeping the 
moon in orbit. We have already seen in the previous sec- 
tion the peculiar property of horizons predicted by GR. But 
just saying that there is a horizon is not sufficient. When 
you start thinking about it seriously, a lot of hard questions 
come your way, questions that probe the deeper grounds 
of GR and go beyond it. This section touches upon the 
remarkable research by many of the brightest brains of re- 
cent generations, attempting to bridge the gap between 
curved space-time and quantum theory. This appears nec- 
essary to get a complete and consistent picture of these 
miraculous outposts of reality. Yes, horizons mark an im- 
portant frontier, but to what? 


Stephen Hawking: Quantum black holes are not black! 
We have discussed the essential feature of GR that ev- 


ery mass M has a Schwarzschild radius R, associated 
with it. If the size of the massive object is smaller than its 
Schwarzschild radius, then there will be a horizon around 
the mass located at R, . It is called a horizon because if a 
chair or for that matter Shakespeare’s collected works fall 
into the black hole through their gravitational attraction to 
its mass, then once they pass the horizon, there is no pos- 
sibility for them to return. Falling into a black hole there is 
a point of no return. And the points of no return form by 
definition the horizon. That raises the question of what happens
to all that stuff disappearing in the black hole. Einstein’s
theory says without further ado that it disappears in
the ‘singularity’ located at the origin. But that is not what
a far away observer sees, because they cannot look beyond
the horizon. They only see the books approaching
the horizon at an ever slower rate. Here a strange complementarity
of perspectives arises, because for the infalling
‘Hamlet’ or ‘A Midsummer Night’s Dream’ nothing special
happens as they would smoothly sail through the horizon.
From that moment on their fate is decided: they will
be swallowed by the singularity; no pardon can be granted,
there is just no escape!
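
To get a feeling for how small the Schwarzschild radius is for familiar objects, here is a minimal sketch that evaluates $R_s = 2 G_N M / c^2$. The constant values and the two example masses are rounded reference numbers that we put in by hand, and the function name is our own choice.

```python
# Rounded SI values, entered by hand for illustration only.
G_N = 6.67430e-11      # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8       # speed of light [m/s]

# Approximate example masses (assumed values).
M_SUN = 1.98847e30     # [kg]
M_EARTH = 5.9722e24    # [kg]


def schwarzschild_radius(mass_kg):
    """Return the Schwarzschild radius R_s = 2 G_N M / c^2 in metres."""
    return 2.0 * G_N * mass_kg / C**2


if __name__ == "__main__":
    for name, mass in [("Sun", M_SUN), ("Earth", M_EARTH)]:
        print(f"{name}: R_s = {schwarzschild_radius(mass):.3e} m")
```

With these inputs the Sun comes out at roughly three kilometres and the Earth at just under a centimetre, so neither object is anywhere near being smaller than its own horizon.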


So, altogether the physics of black holes was for a long 
time highly enigmatic, but also unsatisfactory, strangely in- 
complete and paradoxical to say the least. On the one 
hand, Einstein’s theory inescapably posed their existence, 
but on the other hand failed to answer many of the basic 
questions it posed. From the 1970s, fundamental break- 
throughs have been achieved in our understanding of black 
holes. Indeed it turned out that quantum theory had to
come to the rescue and resolve some of the bizarre contradictions
that black holes confronted the physicists with.
Actually it was both quantum theory and information theory
that played essential roles. It has become clear that a
deep understanding of how black holes work at the quantum
level could provide the essential keys to a broad understanding
and interpretation of what a consistent quantum
theory of gravity may ultimately look like.

Black hole thermodynamics


It is natural to introduce the concept of black hole 
entropy as the measure of information about the 
black hole interior. 

Jacob Bekenstein (1972) 


At first there was the development of a ‘thermodynamics of 
black holes’ by Jacob Bekenstein and Stephen Hawking. 
Subsequently Hawking made the seminal discovery that
when quantum processes are taken into account, the black
hole is no longer black! Quantum processes that take
place at the horizon and which are not allowed by classical
physics mean that the black hole will lose energy due to
radiation coming off the horizon. Hawking was able to calculate
the spectrum of this radiation and that turned out to
be exactly the black body spectrum explained by Planck’s
quantum hypothesis. It means that the mysterious object
changed from a black hole into a radiating black body. The
black hole is rather like a black ball kept at a certain temperature,
appropriately called the Hawking temperature,
$T_H$. Let us try to get a grasp of the main components of
the argument by recalling some basic concepts:


(i) the thermodynamic relation due to Clausius between 
heat produced and entropy 
dQ = TdS; (1.3.16) 


(ii) the Boltzmann definition of entropy in terms of the num- 
ber of states 


$S = k \ln W$; (1.3.17)

(iii) the Schwarzschild radius

$R_s = \frac{2 G_N M}{c^2}$; (1.3.18)

and (iv) the Planck length

$l_p = \sqrt{\frac{\hbar G_N}{c^3}}$. (1.3.19)

Figure 1.3.10: Information on the horizon. An artist's impression
of bits of information on the horizon of a black hole. The
information capacity would be one bit, or rather nat, per square
Planck length.


We start with the observation that classically, nothing can 
come out of the black hole, so if you drop an object with 
a certain energy and entropy into the black hole, the only 
thing you may observe is that the mass increases (and 
therefore the Schwarzschild radius), but the information 
content would be lost forever. The idea now is to associate
the mass-energy $Mc^2$ with the heat term on the left of equation
(1.3.16) and the area of the horizon A with the entropy term
on the right.


Let us talk about spherical black holes, then $A = 4\pi R_s^2$. To
convert this area into some entropy, let me define a Planck
area $a_p$, which we will choose as $a_p = 4 l_p^2$. The comment
here is that the Planck length is the smallest length scale
that is physically meaningful, which means that this Planck
square is the smallest physically accessible area (as I am
giving a heuristic argument I allowed myself to put in the
extra factor 4 for convenience). This means that we assume
that a single Planck square corresponds to one nat
of information*,

$S = k \frac{A}{a_p} = \frac{k A c^3}{4 \hbar G_N}$, (1.3.20)


which is exactly the expression first written down by Beken- 
stein and Hawking. In this perspective a black hole would 
look more like a spherical digital memory as indicated in 
Figure 1.3.10. 
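
To see how large this ‘memory’ would be, the following sketch counts the Planck squares on the horizon, $S/k = A/(4 l_p^2)$, for a black hole of one solar mass. It is only an illustration of the formula above; the constants are rounded reference values entered by hand and the function names are ours.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
G_N = 6.67430e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8         # speed of light [m/s]
M_SUN = 1.98847e30       # approximate solar mass [kg]


def planck_length():
    """l_p = sqrt(hbar G_N / c^3) in metres."""
    return math.sqrt(HBAR * G_N / C**3)


def horizon_area(mass_kg):
    """Area A = 4 pi R_s^2 of a spherical horizon, in square metres."""
    r_s = 2.0 * G_N * mass_kg / C**2
    return 4.0 * math.pi * r_s**2


def entropy_in_nats(mass_kg):
    """Bekenstein-Hawking entropy divided by k, i.e. S/k = A / (4 l_p^2)."""
    return horizon_area(mass_kg) / (4.0 * planck_length() ** 2)


if __name__ == "__main__":
    print(f"l_p = {planck_length():.3e} m")
    print(f"S/k for one solar mass = {entropy_in_nats(M_SUN):.3e} nats")
```

For one solar mass the count is of order $10^{77}$ nats, and because the area grows like $M^2$, doubling the mass quadruples the entropy.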


This is a surprising result, because, as entropy is asso- 
ciated with the number of degrees of freedom of the sys- 
tem, you would expect that in three dimensions the entropy 
within a volume bounded by some horizon would grow pro- 
portional to the volume and not to the area. This suggests 
that this rather fictitious, mathematically defined surface 
will somehow acquire an important physical interpretation 
if we take quantum processes into account. 


Hawking temperature. 


Quantum mechanical effects cause black holes to 
create and emit particles as if they were hot bodies. 
Stephen Hawking (1975) 


Next, we want to find the temperature of the black hole
as a thermodynamic system. The internal energy is given
by $U = Mc^2$ and the entropy S is given by the previous
equation. That equation allows us to calculate:

$\frac{dS}{dM} = \frac{dS}{dR_s} \frac{dR_s}{dM} = \frac{8\pi R_s^2}{M} \frac{c^3 k}{4 \hbar G_N}$, (1.3.21)

which relates a change in mass with a change in entropy.
We may obtain the temperature by using the first law of
thermodynamics $dU = dQ - dW$ where the last term on
the right-hand side is absent because a black hole doesn’t


*With W states the information entropy is $H = \log_2 W$ [bits]; the
thermodynamical information entropy is defined by $S/k = \ln W$ [nats].
We choose the nat unit because we want to make the link to the natural
logarithm appearing in thermodynamics.


Figure 1.3.11: Pair creation at the horizon. In a vacuum one 
always has quantum fluctuations in energy. As a consequence 
of Heisenberg’s uncertainty principle virtual particle anti-particle 
pairs will be created. Normally these have to recombine but 
on the horizon there is the possibility that one member of the 
pair falls in the black hole and the other escapes. This is the 
microscopic origin of the Hawking radiation. 


do any work, while for the first term we will use the expression
of (1.3.16). This yields the following expression for the
change in internal energy:

$dU = d(Mc^2) = \frac{\hbar c}{4\pi R_s k} dS$,

from which the temperature follows,

$T_H = \frac{\hbar c}{4\pi R_s k} = \frac{\hbar c^3}{8\pi G_N M k}$. (1.3.22)


This is indeed the temperature Hawking derived in his fa- 
mous paper from 1975, and which was therefore named 
after him. 
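
The heuristic route above can be checked numerically. The sketch below takes the entropy of equation (1.3.20), differentiates it with respect to the mass by a simple central difference, and compares $T = dU/dS = c^2\, dM/dS$ with the closed form (1.3.22). The step size, the constant values and the function names are our own choices, made only for this illustration.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
G_N = 6.67430e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
M_SUN = 1.98847e30       # approximate solar mass [kg]


def entropy(mass_kg):
    """S = k A c^3 / (4 hbar G_N) with A = 4 pi R_s^2, in J/K."""
    r_s = 2.0 * G_N * mass_kg / C**2
    area = 4.0 * math.pi * r_s**2
    return K_B * area * C**3 / (4.0 * HBAR * G_N)


def temperature_from_entropy(mass_kg, rel_step=1e-6):
    """T = dU/dS = c^2 dM/dS, estimated with a central difference."""
    dm = rel_step * mass_kg
    ds = entropy(mass_kg + dm) - entropy(mass_kg - dm)
    return C**2 * (2.0 * dm) / ds


def hawking_temperature(mass_kg):
    """Closed form T_H = hbar c^3 / (8 pi G_N M k), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G_N * mass_kg * K_B)


if __name__ == "__main__":
    print(temperature_from_entropy(M_SUN), hawking_temperature(M_SUN))
```

The two numbers agree to within the accuracy of the finite difference, which is all this check is meant to show.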


We only took the shortest and easiest route which suggests
the result, but Hawking really proved that a black
hole would radiate as a black stove at that temperature.

Figure 1.3.12: Hawking radiation. A black hole is not black as
depicted in Figure 1.3.4, but rather as depicted here: a black
body at a Hawking temperature $T_H$, emitting thermal radiation.


The idea was simple but brilliant and the calculation notori- 
ously hard. The starting point of the original derivation was 
the production of virtual particle — anti-particle pairs in the 
gravitational field near the horizon. He basically calculated 
the probability that one of the two would fall in, then they 
could not recombine, and the other particle would have to 
‘become’ real and would be able to escape. The quantum 
black hole would indeed start radiating and would lose en- 
ergy. Talking in terms of pictures, we should thus replace 
the image of the classical black hole of Figure 1.3.4 with 
the quantum version of Figure 1.3.12. 


Power emitted and life time estimate. The total power emitted
by a black body is given by the radiation law of Stefan
and Boltzmann (the ‘other’ Boltzmann), which states that
the emitted power is proportional to the temperature
to the fourth power and of course also proportional to
the surface area, $P \sim A T^4$. Since $A \sim R_s^2 \sim M^2$ and,
by equation (1.3.22), $T_H \sim M^{-1}$, this
implies that the power emitted is inversely proportional
to the second power of its mass, $P \sim M^{-2}$. The black
hole loses mass because of the Hawking radiation but the 


more mass it has lost the more it radiates. The final stages 
are therefore more like an explosion. A black hole would 
not be black anymore; on the contrary, left on its own in 
outer space it would evaporate until nothing, or maybe
only some unknown type of remnant, would be left! From
the power dependence on the mass, we can make a rough 
estimate of the life time t of a black hole, with the following 
calculation: 


$\int_{t(M_0)}^{t(0)} dt = \int_{M_0}^{0} \left(\frac{dt}{dM}\right) dM \sim M_0^3$. (1.3.23)


The conclusion is that the life time of a black hole would 
grow with its mass to the third power. 
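
The scaling can be turned into a rough number. The sketch below assumes an idealized black body that emits only according to the Stefan-Boltzmann law from its horizon area, with nothing falling in; a realistic treatment would include corrections we ignore here. Writing $P = K/M^2$ and $c^2\, dM/dt = -P$ gives $t = c^2 M_0^3 / (3K)$. All constants are rounded reference values entered by hand and the function names are ours.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
G_N = 6.67430e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
M_SUN = 1.98847e30       # approximate solar mass [kg]
SECONDS_PER_YEAR = 3.156e7


def stefan_boltzmann_constant():
    """sigma = pi^2 k^4 / (60 hbar^3 c^2) in W m^-2 K^-4."""
    return math.pi**2 * K_B**4 / (60.0 * HBAR**3 * C**2)


def hawking_temperature(mass_kg):
    """T_H = hbar c^3 / (8 pi G_N M k), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G_N * mass_kg * K_B)


def emitted_power(mass_kg):
    """Idealized black-body power P = sigma A T_H^4, in watts."""
    r_s = 2.0 * G_N * mass_kg / C**2
    area = 4.0 * math.pi * r_s**2
    return stefan_boltzmann_constant() * area * hawking_temperature(mass_kg) ** 4


def lifetime_seconds(initial_mass_kg):
    """Integrate c^2 dM/dt = -K / M^2 from M_0 down to zero."""
    k_coeff = emitted_power(initial_mass_kg) * initial_mass_kg**2  # mass independent
    return C**2 * initial_mass_kg**3 / (3.0 * k_coeff)


if __name__ == "__main__":
    print(f"solar mass: {lifetime_seconds(M_SUN) / SECONDS_PER_YEAR:.2e} yr")
    print(f"10^6 kg:    {lifetime_seconds(1.0e6):.2e} s")
```

Under these assumptions a solar-mass black hole would take of order $10^{67}$ years to evaporate, while a hypothetical mini black hole of a thousand tonnes would be gone in of order $10^2$ seconds.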


To put this discussion of black hole evaporation in perspective
let us mention that the Hawking temperature for
a solar-mass black hole would only be about 60 nanokelvin.
Such a black hole presently located somewhere in our universe
would absorb far more radiation than it would emit,
because the universe itself has at present a background
temperature of 2.7 K. This in turn is a consequence of
the cooling of the universe by expansion, and is in fact a
leftover from the hot Big Bang. So the Hawking radiation
phenomenon is profound but hypothetical in so far as there
is little hope of being able to directly observe it. Consequently,
though his discovery is considered by many to be one of the great
discoveries of twentieth century physics, Hawking was not
eligible for a Nobel prize.
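
The numbers in this comparison are easy to reproduce. The sketch below evaluates equation (1.3.22) for one solar mass and also asks, as an extra illustration of our own, at which mass the Hawking temperature would equal the 2.7 K background. Constants are rounded reference values entered by hand.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
G_N = 6.67430e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
M_SUN = 1.98847e30       # approximate solar mass [kg]
T_BACKGROUND = 2.7       # background temperature quoted in the text [K]


def hawking_temperature(mass_kg):
    """T_H = hbar c^3 / (8 pi G_N M k), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G_N * mass_kg * K_B)


def mass_for_temperature(temperature_k):
    """Invert T_H(M): the mass whose Hawking temperature is temperature_k."""
    return HBAR * C**3 / (8.0 * math.pi * G_N * temperature_k * K_B)


if __name__ == "__main__":
    print(f"T_H(solar mass) = {hawking_temperature(M_SUN):.2e} K")
    print(f"M with T_H = 2.7 K: {mass_for_temperature(T_BACKGROUND):.2e} kg")
```

The solar-mass value comes out at about $6 \times 10^{-8}$ K, in line with the figure quoted above. Only black holes lighter than a few times $10^{22}$ kg would be hotter than the present background and so lose more than they absorb.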


Surface gravity. The beauty of Hawking’s discovery is that
it strongly suggests that it is the horizon where the interesting
physics of a black hole really takes place. But at the
horizon we are still far away from the singularity and space-time
is smooth. This suggests that we should try to link
the emerging temperature of the horizon to a local gravitational
concept. The natural candidate would be what is
called the ‘surface gravity’, which is basically the gravitational
acceleration denoted by g at the horizon. We mean
the ‘universal’ gravitational acceleration Galilei was talking
about. Indeed, an observer located at the horizon has to
be accelerated to stay there, just like we are at rest at the
earth’s surface. The reason we are not freely falling is that
the Earth’s surface accelerates us radially outward
by exerting a normal force. The gravitational acceleration
or surface gravity at the horizon is just given by (minus)
Newton’s law: $g = G_N M/R_s^2$. Substituting the Schwarzschild
radius and comparing with the Hawking temperature
we find that the relationship between the Hawking temperature
and the surface gravity is strikingly simple:

$T_H = \frac{\hbar g}{2\pi c k}$. (1.3.24)
It appears that what matters is the transformation from the 
frame of the observer, freely falling into the black hole for 
whom nothing special is going on, to the accelerated frame 
of the observer at rest near the horizon. 
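
The substitution behind (1.3.24) is short: with $R_s = 2 G_N M/c^2$ one finds $g = c^4/(4 G_N M)$, and inserting $M = c^4/(4 G_N g)$ into (1.3.22) gives $\hbar g/(2\pi c k)$. The sketch below simply confirms numerically that both expressions give the same temperature. The Newtonian form of g is the heuristic one used in the text, and the constants and names are our own choices.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
G_N = 6.67430e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
M_SUN = 1.98847e30       # approximate solar mass [kg]


def surface_gravity(mass_kg):
    """Heuristic surface gravity g = G_N M / R_s^2 at the horizon [m/s^2]."""
    r_s = 2.0 * G_N * mass_kg / C**2
    return G_N * mass_kg / r_s**2


def temperature_from_gravity(g):
    """Equation (1.3.24): T = hbar g / (2 pi c k), in kelvin."""
    return HBAR * g / (2.0 * math.pi * C * K_B)


def hawking_temperature(mass_kg):
    """Equation (1.3.22): T_H = hbar c^3 / (8 pi G_N M k), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G_N * mass_kg * K_B)


if __name__ == "__main__":
    g_sun = surface_gravity(M_SUN)
    print(f"g at the horizon of a solar-mass black hole: {g_sun:.3e} m/s^2")
    print(temperature_from_gravity(g_sun), hawking_temperature(M_SUN))
```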


Accelerated observers and the Unruh effect 


The above suggests that we should look at the world ac- 
cording to an accelerated observer. This yields another 
interesting, even more basic link between the structure of 
space-time and entropy/information known as the Unruh 
effect. Let us first establish that an accelerated observer 
perceives a horizon. If you transform flat Minkowski space
to an accelerated frame you get the so-called Rindler coordinates,
which are depicted in Figure 1.3.13. The world lines
of the accelerated observers are time-like hyperbolae. To
be precise you should say that the world lines correspond
to observers who experience a ‘constant force’, because
with F = ma and the fact that the mass in this formula 
is the relativistic mass increasing with the velocity, the ef- 
fective acceleration becomes smaller so that the velocity 
never exceeds c, as the figure shows. And that is exactly 
why the horizon is there. The future light cone of any point
beyond the horizon (like the yellow arrow in the dark region)
does not intersect with the world line of the accelerated
observer, and that point therefore cannot be observed.

Figure 1.3.13: Accelerated observers. A space-time diagram
for an accelerated frame of reference. The parameter $s = c^2/g$
is inversely proportional to the acceleration g. The world lines of observers
initially at rest on the x-axis are time-like hyperbolae that asymp- 
totically approach the light cones. The white region is the Rindler 
space-time and has a future and past horizon. Light signals 
emitted from points in the dark region travel along straight lines 
under 45°, these do not intersect any world lines and therefore 
can never be observed by the accelerated observer. 
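
The hyperbolic world lines and the horizon can be explored with a few lines of code. The sketch below uses the standard parametrization by proper time, $ct = s \sinh(g\tau/c)$ and $x = s \cosh(g\tau/c)$ with $s = c^2/g$, for an observer who starts at rest at $x = s$. In this 1+1 dimensional picture an event can send a light signal to the observer only if it lies at $x > ct$. The function names and the choice of g are ours.

```python
import math

C = 2.99792458e8  # speed of light [m/s], rounded reference value


def rindler_worldline(g, tau):
    """Event (ct, x) and velocity v at proper time tau for an observer
    with constant proper acceleration g, starting at rest at x = c^2/g."""
    s = C**2 / g
    eta = g * tau / C            # rapidity, dimensionless
    ct = s * math.sinh(eta)
    x = s * math.cosh(eta)
    v = C * math.tanh(eta)       # always below c
    return ct, x, v


def signal_can_reach_observer(ct0, x0):
    """True if light emitted at (ct0, x0) can meet the world line.
    Events with x0 <= ct0 lie beyond the future horizon ct = x."""
    return x0 > ct0


if __name__ == "__main__":
    g = 9.81                     # assumed acceleration [m/s^2]
    year = 3.156e7               # seconds
    for n in (1, 2, 5):
        ct, x, v = rindler_worldline(g, n * year)
        print(f"tau = {n} yr: v/c = {v / C:.4f}, x^2 - (ct)^2 = {x**2 - ct**2:.4e} m^2")
```

The combination $x^2 - (ct)^2$ stays equal to $s^2$ along the world line, which is the hyperbola of the figure, while the velocity creeps towards c without reaching it.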


This suggests that also in this case including quantum processes
à la Hawking will turn that horizon into a black body
with a temperature given by the very same formula (1.3.24),
linked to the acceleration $g = c^2/s$ of the observer. This
remarkable result is known as the Unruh effect, named after
William Unruh who first presented the calculations leading
to the formula (1.3.24) in 1976. The proof for this case
of an accelerated observer amounts to a rather straightfor- 
ward (quantum) calculation. We start in the rest frame with 
a quantum field describing some species of scalar parti- 
cle. The field is in a zero energy state, usually called the 
vacuum. In this state no particles are present, and that is 
what the observer at rest perceives. If then we make the 
transformation to an accelerated frame, the transformed 
distribution for the density of states corresponds to a thermal
energy distribution. This means that the accelerated
observer will perceive a highly excited state with many particles
present. The spectrum obtained corresponds exactly
to the Planck spectrum, the direct consequence of
the quantization of energy. This is the spectrum that was
hailed as one of the nails in the coffin of classical physics!
This calculation by the way beautifully illustrates the relativistic
proverb: ‘truth is in the eye of the beholder’, and in
particular depends on his frame of reference.
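
To appreciate how tiny the effect is, the sketch below evaluates formula (1.3.24) for an everyday acceleration and inverts it to find the acceleration needed for a temperature of one kelvin. The constants are rounded reference values entered by hand; the function names and the example inputs are our own.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]


def unruh_temperature(g):
    """T = hbar g / (2 pi c k) for proper acceleration g [m/s^2]."""
    return HBAR * g / (2.0 * math.pi * C * K_B)


def acceleration_for_temperature(temperature_k):
    """Invert the formula: acceleration that gives temperature_k."""
    return 2.0 * math.pi * C * K_B * temperature_k / HBAR


if __name__ == "__main__":
    print(f"T at g = 9.81 m/s^2: {unruh_temperature(9.81):.2e} K")
    print(f"g needed for 1 K:    {acceleration_for_temperature(1.0):.2e} m/s^2")
```

At the acceleration of a falling apple the temperature is roughly $4 \times 10^{-20}$ K, and an acceleration of order $10^{20}$ m/s$^2$ would be needed to reach a single kelvin.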


Pair creation of charges. In this context it is illuminating to
think of the original Hawking argument. Consider a simple
two-dimensional (x, t) space-time where one may introduce
a constant background electric field, say $E_0 \sim F_0$,
in the positive x-direction. This, in two space-time dimensions,
corresponds to a Lorentz invariant background energy.
In such a background field there is a certain quantum
probability that charged particle anti-particle pairs can be
created. Clearly these pairs would split up, the positive
charge moving to the right and the negative charge to the
left. They would experience a constant force $F = \pm eE$ and
therefore accelerate, and both would correspond to ideal
‘Rindler observers.’


The situation is depicted in the space time diagram of Fig- 
ure 1.3.14 with the two particles accelerating in opposite 
directions, with their velocities asymptotically approaching 
the speed of light. They are causally separated from the 
moment of their creation. The probability of the pair creation
depends on the threshold energy which corresponds
to the sum of their rest energies, $2mc^2$, and the electrostatic
energy of the pair depending on their distance d. The
remarkable result is that the spectrum corresponds ex- 
actly to a thermal distribution matching the Unruh temper- 
ature. 
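
A back-of-the-envelope version of the energy balance: the field does work $eEd$ on a pair separated by a distance d, so the rest energy $2mc^2$ is paid for once $d = 2mc^2/(eE)$. The sketch below evaluates this separation for an electron-positron pair. This balance is our own heuristic reading of the threshold argument, not the full quantum calculation, and the field strength used is an arbitrary example value.

```python
# Rounded SI values, entered by hand for illustration only.
C = 2.99792458e8              # speed of light [m/s]
M_ELECTRON = 9.1093837015e-31 # electron mass [kg]
E_CHARGE = 1.602176634e-19    # elementary charge [C]


def threshold_separation(field_v_per_m):
    """Separation d at which the work e E d equals the rest energy 2 m c^2."""
    return 2.0 * M_ELECTRON * C**2 / (E_CHARGE * field_v_per_m)


if __name__ == "__main__":
    example_field = 1.0e18    # assumed field strength [V/m]
    print(f"d = {threshold_separation(example_field):.3e} m")
```

For the assumed field of $10^{18}$ V/m the separation is about $10^{-12}$ m, and it grows in inverse proportion as the field is made weaker.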


One other quantum aspect of profound interest in this ex- 
ample is the fact that in the quantum state in which the 
pair is created, the particles are entangled. The informa- 
tion of one member of the pair is inaccessible to the other 


Figure 1.3.14: Pair creation. In a background electric field, a 
particle anti-particle pair might be created spontaneously and 
the members of the pair would accelerate in opposite directions, 
being causally disconnected from their inception, each living in 
its own Rindler bubble. However in the quantum world the pair 
would be entangled which leads to a situation where each of 
them is dealing with a so-called mixed state. 


because it is hidden behind a horizon. This manifests itself 
in that each of the particles perceives being in a mixed or 
‘thermal’ state with a characteristic entanglement entropy. 
We will return to these concepts in Chapter II.1 but want to
mention them already here.


The fate of information. Thermal radiation is completely 
random (thus maximally uncorrelated and unconstrained) 
and therefore has maximal entropy. Here we arrive at a 
familiar point where by solving one question we pose the 
next one. The idea that quantum processes preserve all
entropy and therefore information leads to a non-trivial upgrading
of the information paradox: if we throw the entire
Encyclopaedia Britannica in a black hole it will be converted
into pure thermal radiation according to Hawking. Clearly
that cannot be the case: where did all the correlations
present in the incoming state go?

If we take a step back, we could compare the formation of
a black hole (putting more and more mass on a star, until
it becomes a black hole), and its subsequent evaporation,
with a more familiar process (proposed by Sidney Coleman)
where we know that quantum processes conserve
both energy and entropy. Imagine a piece of coal at zero 
temperature in a pure state where by definition S = 0, that 
gets irradiated with a fixed amount of high entropy radia- 
tion, which we assume is absorbed completely. It brings 
the coal into an excited state at a finite temperature. As a 
consequence the piece of coal starts radiating and will eventually
return to the zero temperature state, with zero entropy.
As the process of absorbing the initial radiation and
emitting the outgoing radiation is a quantum process, it 
follows that the emitted radiation should have exactly the 
same entropy as the incoming radiation. And therefore no 
information could have gotten lost! 


The black hole instability. The conventional narrative is that 
by throwing an encyclopedia into a black hole, all informa- 
tion would be lost. With the appearance of quantum theory 
at the horizon, however, our view has radically shifted in 
the sense that the real physics of black holes is the quan- 
tum physics taking place at the horizon. Consequently the 
question of what happens to the information in the quan- 
tum context needs to be critically re-evaluated. 


The quantum principles tell you that if you were able to re- 
ally perform the full quantum calculations including the de- 
tailed effects of entanglement, which we haven’t discussed 
yet, then in that case you could in principle recover the en- 
tire incoming state. In other words, in the quantum domain 
it is extremely hard to really get rid of information, it may 
be hiding, but it still should be out there somewhere. By 
the way, to people having a Facebook account, this story 
may sound unpleasantly familiar. 


In principle black holes may exist for any mass, hence one
may also consider microscopic black holes to rid oneself of
the many astrophysical complications that are irrelevant in

Figure 1.3.15: The magic cube. Universal constants and the
domains of validity of some fundamental theories.


this context. The statement is that mini-black holes would 
evaporate very rapidly and therefore be very short-lived. In 
other words these mini black holes are states of matter that 
are bound gravitationally, but are unstable, just like many 
other massive ‘bound states’ happen to be. This instability 
gives rise to a finite life time, and the formation and decay
process is a quantum process, often referred to as a ‘resonance’.
In such quantum processes information is preserved,
much like the other conservation laws that physics
obeys, like the conservation of energy, angular momentum
and charge. So, Hawking’s crucial discovery has in the
end led to a fundamental overhaul of our concept of black
holes. As a consequence the present view is
that quantum theory supersedes general relativity in
that information has to be preserved somehow. Yet, at this
moment, a fully quantum mechanical account of the for- 
mation and subsequent evaporation of a basic black hole, 
which is the litmus test for claiming an understanding of 
quantum gravity, has not yet been achieved. With
the advent of string theory as a serious candidate for such
a theory, though, the perspective on black holes has progressed
in impressive ways as we will indicate towards the end of
the next chapter. But the fact remains that science at any
stage is just ‘work in progress.’


The magic cube 


The magic cube of turning points. Our story of scien- 
tific progress and hope, linked to the successive identifi- 
cation of fundamental constants of nature, is depicted in 
the ‘magic cube’ of Figure 1.3.15, which is a cube in the 
space of ideas. The magic cube has classical Newtonian
physics on the back lower edge with the laws of mechanics
(like F = ma) on the left, and Newton's law of gravitation
on the right. The constant $G_N$ linking the two appears as
the universal constant setting the scale for the strength of
the gravitational interaction. The bottom square is the relativity
plane where the fundamental constant c (or better
1/c) is added. The boundary of the domain where Newtonian
physics is valid is reached as velocities become of
the order of the velocity of light. The Newtonian limit corresponds
to $1/c \to 0$. Newton’s gravitational force law is
instantaneous and therefore incorporates the notion of ‘action
at a distance’. This notion is incompatible with special
relativity, where information and thus disturbances cannot 
propagate faster than the velocity of light. So special rel- 
ativity vetoes instantaneous non local interactions. This 
conflict was then brilliantly resolved by Einstein’s theory of 
gravity, the theory of general relativity. 


The vertical dimension opened up with the advent of quantum
theory through the universal constant $\hbar$. The classical
(non-quantum) limit corresponds to $\hbar \to 0$. The vertical
square on the left-hand side includes the modern unified 
quantum theories for all known forces and matter, except 
for the gravitational force. The top plane would include a 
quantum theory of gravity like string theory which so far 
has not been able to generate predictions that could be 
tested by experiment. A string theorist may argue that if
you had started by postulating string theory, you would
have predicted gravity and the other interactions. Such
theories unify the notions of matter and radiation with that
of space-time. The magic cube illustrates how inconsistencies
led the way to fundamental paradigm shifts. Such are
the blessings of the inconvenient truths that keep popping
up along the winding road of science.


Table 1.3.2: Some fundamental sizes and scales.

Notion | Formula | Size [m]
Class. electron radius | $r_e = \frac{e^2}{4\pi\varepsilon_0 m_e c^2}$ | ~ $10^{-15}$
Compton wavelength ($e^-$) | $\lambda_e = \frac{\hbar}{m_e c}$ | ~ $10^{-12}$
Strong nuclear scale | $\lambda_\pi = \frac{\hbar}{m_\pi c}$ | ~ $10^{-15}$
Weak nuclear scale | $\lambda_W = \frac{\hbar}{m_W c}$ | ~ $10^{-18}$
Bohr radius | $a_0 = \frac{4\pi\varepsilon_0 \hbar^2}{m_e e^2}$ | ~ $10^{-10}$
Schwarzschild radius (Earth) | $R_s = \frac{2 G_N M}{c^2}$ | ~ $10^{-2}$
Planck length | $l_p = \sqrt{\frac{\hbar G_N}{c^3}}$ | ~ $10^{-35}$
Age of the universe | $t = \frac{2}{3 H_0}$ | ~ $10^{10}$ yr
Thermal wavelength (gas, 300 K) | $\lambda_{th} = \frac{\hbar}{\sqrt{3 m k T}}$ | ~ $10^{-11}$
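
The length scales of Table 1.3.2 can be reproduced in a few lines. The sketch below evaluates each formula and rounds the base-ten logarithm to the nearest integer. All input values are rounded reference numbers entered by hand; the pion and W rest energies, and the choice of molecular nitrogen as the ‘gas’, are our own assumptions for the sake of the example.

```python
import math

# Rounded SI values, entered by hand for illustration only.
HBAR = 1.054571817e-34        # reduced Planck constant [J s]
G_N = 6.67430e-11             # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8              # speed of light [m/s]
K_B = 1.380649e-23            # Boltzmann constant [J/K]
E_CHARGE = 1.602176634e-19    # elementary charge [C]
EPS0 = 8.8541878128e-12       # vacuum permittivity [F/m]
M_ELECTRON = 9.1093837015e-31 # electron mass [kg]
M_EARTH = 5.9722e24           # approximate Earth mass [kg]
ATOMIC_MASS_UNIT = 1.66053906660e-27  # [kg]
JOULE_PER_MEV = 1.0e6 * E_CHARGE


def mass_from_mev(rest_energy_mev):
    """Convert a rest energy in MeV to a mass in kg."""
    return rest_energy_mev * JOULE_PER_MEV / C**2


M_PION = mass_from_mev(139.57)        # charged pion, assumed rest energy
M_W = mass_from_mev(80.38e3)          # W boson, assumed rest energy
M_GAS = 28.0 * ATOMIC_MASS_UNIT       # assumption: nitrogen molecules
T_GAS = 300.0                         # [K]


def fundamental_scales():
    """Return the length scales of the table in metres."""
    return {
        "classical electron radius": E_CHARGE**2 / (4 * math.pi * EPS0 * M_ELECTRON * C**2),
        "Compton wavelength (electron)": HBAR / (M_ELECTRON * C),
        "strong nuclear scale": HBAR / (M_PION * C),
        "weak nuclear scale": HBAR / (M_W * C),
        "Bohr radius": 4 * math.pi * EPS0 * HBAR**2 / (M_ELECTRON * E_CHARGE**2),
        "Schwarzschild radius (Earth)": 2 * G_N * M_EARTH / C**2,
        "Planck length": math.sqrt(HBAR * G_N / C**3),
        "thermal wavelength (gas, 300 K)": HBAR / math.sqrt(3 * M_GAS * K_B * T_GAS),
    }


def order_of_magnitude(value):
    """Nearest integer power of ten."""
    return round(math.log10(value))


if __name__ == "__main__":
    for name, size in fundamental_scales().items():
        print(f"{name:32s} {size:.3e} m  ~ 10^{order_of_magnitude(size)}")
```

Changing the assumed gas or replacing $\hbar$ by h shifts some entries by a factor of a few, which is exactly the level of precision at which such order-of-magnitude tables should be read.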


Conclusion. In this chapter we have celebrated the ‘back 
of the envelope’ philosophy and advocated for the virtue 
of heuristics and approximations. In science the ‘truth’ is 
a moving target, elusive like a holy grail because science
is by definition ‘work in progress’. And if your work is in
progress the notion of truth makes you feel extremely uncomfortable.
Every new theory or model is just the next,
more sophisticated, working hypothesis. But, as we have
shown in this chapter, there is, as every engineer can tell
you, a certain pleasure as well as value in playing with
the numbers given to you, and applying some dimensional
analysis to them. A purist may call it ‘recreational physics’.
And indeed, that’s what we were concerned with so as to
get an idea of the relevant scales that are linked to the spe- 
cific values of our ‘universal’ constants. This game has 
provided us with some surprisingly deep insights about 
where quantum effects will rear their heads. 
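
As a small example of the dimensional analysis game, the sketch below asks which powers of $\hbar$, $G_N$ and c combine into a pure length, time or mass. It writes each constant as a vector of exponents of (mass, length, time) and solves the resulting three linear equations exactly with fractions. The tiny solver and all names are our own illustration, not a method taken from the text.

```python
from fractions import Fraction

# Exponents of (mass, length, time) in the dimension of each constant.
DIMENSIONS = {
    "hbar": (1, 2, -1),    # kg m^2 s^-1
    "G_N": (-1, 3, -2),    # kg^-1 m^3 s^-2
    "c": (0, 1, -1),       # m s^-1
}
NAMES = ("hbar", "G_N", "c")


def solve_linear(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def planck_exponents(target):
    """Exponents (a, b, c) such that hbar^a G_N^b c^c has dimension `target`,
    where target lists the exponents of (mass, length, time)."""
    matrix = [[DIMENSIONS[name][i] for name in NAMES] for i in range(3)]
    return tuple(solve_linear(matrix, target))


if __name__ == "__main__":
    for label, target in [("length", (0, 1, 0)), ("time", (0, 0, 1)), ("mass", (1, 0, 0))]:
        a, b, c = planck_exponents(target)
        print(f"Planck {label}: hbar^{a} G_N^{b} c^{c}")
```

For the length the solver returns the exponents 1/2, 1/2 and -3/2, which is the Planck length of equation (1.3.19); the time and mass come out in the same way.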


Further reading. 
On scales in nature: 


— Mr Tompkins in Paperback 
George Gamow 
Cambridge University Press, 
Reprint from 1939 and 1944 editions (2012) 


Knowledge and Wonder 
Victor F. Weisskopf 
MIT Press (1979) 


In Praise of Science: Curiosity, Understanding, 
and Progress 

Sander Bais 

MIT Press (2010) 


On black holes: 


— Gravity's Fatal Attraction: Black Holes in the Uni- 
verse 
Mitchell Begelman and Martin Rees 
Cambridge University Press (2020) 


— The Little Book of Black Holes 
Steven S. Gubser and Frans Pretorius 
Princeton University Press (2017) 


Chapter I.4 


The quest for basic building blocks 


If, in some cataclysm, all of scientific knowledge 
were to be destroyed, and only one sentence passed 
on to the next generations of creatures, what state- 
ment would contain the most information in the fewest 
words? I believe it is the atomic hypothesis (or the
atomic fact if you wish to call it that) that all things 
are made of atoms - little particles that move around 
in perpetual motion, attracting each other when they 
are a little distance apart, but repelling upon being 
squeezed into one another. 

R.P. Feynman (1961) 


A splendid race to the bottom 


The notion of what the basic building blocks of nature are 
has repeatedly shifted over time. Every time when a new 
layer of structure is uncovered a new set of ‘basic’ build- 
ing blocks is postulated. That way we turned from chem- 
ical elements to atoms, from atoms to the understanding 
of nuclear structure, and from nuclear structure to the ele- 
mentary subnuclear particles we know today. 


Three levels of simplicity. In Figure 1.4.2 we have indicated
the subsequent paradigm shifts with respect to the
fundamental building blocks of matter. It depicts the frontier
of knowledge at three typical moments in the past century,
which one could call ‘three levels of simplicity’. The
first is the level of atoms. The second, nuclear level stands
out for its simplicity with only the electron, proton and neutron
making up the atoms. The electromagnetic binding of
the electrons to the nucleus was provided by the photon
while the protons and neutrons were believed to be held
together by a nuclear force that was mediated by the pion.
But this picture is misleading because I left out the ‘zoo’
of other nuclear particles to be discussed, of which the
proton, neutron, and pion are only the most basic and relevant.
Finally, at the next level there is the Standard Model
of quarks, leptons and force-mediating particles. The figure
provides a bird’s eye view of the path of science that
brought us ever deeper into matter, and that path is what
we are going to run through fast in this section, and explore
in more detail in the remainder of this chapter.


Figure 1.4.1: The human quest for understanding nature.


The periodic table: atoms. Around 1900 the chemical
elements were considered the basic building blocks of all
matter. They were neatly catalogued by the Russian chemist
Dmitri Mendeleev in the periodic table that he proposed in 1869,
the table that has decorated most high-school chemistry
classrooms ever since. These elements are the smallest
entities carrying well-defined chemical properties, and as
such are indeed the basic building blocks of all of chemistry.
The strict order present in the periodic system hinted
at an underlying organizing principle that, as we know
now, is nothing but the atomic structure with a nucleus
in the center and electrons ‘orbiting’ around it, the structure
that was uncovered by Ernest Rutherford in 1911 and
was so successfully described by the new quantum theory.
As we explained in the introduction, quantum physics
basically entered our thinking at the atomic level. In this 
part of the book, and this chapter in particular, we will — as 
advertised — go down from the atomic level to the physics 
of the nuclei and the underlying structure of elementary 
particles. 


Matter matters: nuclei. Once it was realized that the 
atoms were composite and therefore not truly fundamen- 
tal, physics turned to the study of the atomic nuclei, which 
led us to a picture on even smaller scales where we distin- 
guished the proton and neutron as the building blocks of 
the nuclei, and of course the electron needed to complete 
the atoms. And to understand the binding of protons and 
neutrons in the nucleus a relatively light particle type was 
identified, the pion, that was assumed to be the carrier of 
the strong nuclear force. It was assumed to play the same 
role as the photon did for the electromagnetic interactions. 
Furthermore, it was discovered that the free neutron was 
in fact unstable; through the so-called β-decay process it
would decay into a proton, an electron and another funda- 


mental particle that had to be postulated to save energy 
and momentum conservation. This elusive particle was 
called the neutrino, a remarkable particle with somewhat 
ghostlike properties in that it was for a long time believed 
to have neither mass nor charge, and therefore extremely 
hard to detect directly. 
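
The bookkeeping behind this decay can be made explicit. The sketch below checks that electric charge balances in $n \to p + e^- + \bar{\nu}_e$ and computes the energy released from the rest energies, neglecting the tiny neutrino mass. The rest energies are rounded reference values entered by hand, and the particle labels and function names are our own.

```python
# Rest energies in MeV (rounded reference values, entered by hand).
REST_ENERGY_MEV = {
    "n": 939.56542,
    "p": 938.27209,
    "e-": 0.51099895,
    "anti-nu_e": 0.0,      # neutrino mass neglected in this sketch
}

# Electric charges in units of the elementary charge.
CHARGE = {"n": 0, "p": +1, "e-": -1, "anti-nu_e": 0}


def charge_is_conserved(initial, final):
    """Compare the total charge before and after the decay."""
    return sum(CHARGE[p] for p in initial) == sum(CHARGE[p] for p in final)


def energy_released_mev(initial, final):
    """Rest energy of the initial state minus that of the final state."""
    before = sum(REST_ENERGY_MEV[p] for p in initial)
    after = sum(REST_ENERGY_MEV[p] for p in final)
    return before - after


if __name__ == "__main__":
    decay_in, decay_out = ["n"], ["p", "e-", "anti-nu_e"]
    print("charge conserved:", charge_is_conserved(decay_in, decay_out))
    print(f"energy released: {energy_released_mev(decay_in, decay_out):.3f} MeV")
```

The released energy is a bit less than 0.8 MeV, shared between the proton, the electron and the neutrino. It was the continuous spread of the observed electron energies that made a third, unseen particle necessary.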


What doesn’t meet the eye: the nuclear particle zoo. 


If I could remember the names of all these particles,
I'd be a botanist.
Enrico Fermi 


During the 1950s and 1960s experiments demonstrated
the existence of an ever-growing list of so-called elementary
nuclear particles which was referred to as the particle
zoo, a term expressing a mild form of despair. Instead of
bringing the number of fundamental building blocks back to 
an ever smaller number, that number seemed to grow with- 
out limit. All of these nuclear particles were called hadrons. 
One class consisted of fancy brothers and sisters of the 
proton and neutron, collectively denoted as baryons. A 
second class contained a large number of relatives of the 
pions, and those were called mesons. Feynman, in one 
of his popular lectures, quipped that the business of par- 
ticle physics basically boiled down to a fancy equivalent 
of smashing watches into a wall, in an attempt to find out 
what was in them and how they worked. 


Law and order regained: the eightfold way. All these 
new baryons and mesons turned out to be composite as 
well. It was quite a mess until Murray Gell-Mann (and inde- 
pendently George Zweig) in 1964 created order by apply- 
ing a beautiful symmetry principle, which Gell-Mann called 
the eightfold way. This term added a spiritual dimension to 
elementary particle physics as it alluded to the teachings 
of Buddha, in particular a fragment from the first sermon 
after his enlightenment, which reads: 


And what, monks, is the middle path, by which 


A SPLENDID RACE TO THE BOTTOM 




CHEMISTRY 


(a) Anno 1900. Mendeleev’s iconic periodic table of 
the chemical elements. The elements are ordered by 
increasing atomic mass in subsequent lines, and the 
columns give elements which have similar chemical 
properties. This structure is a direct consequence of 
applying quantum mechanics to the atom. 


NUCLEAR PHYSICS 


Force carriers 


(b) Anno 1950. The building
blocks of nuclei are the proton
and the neutron, and together
with the electron they make the
atoms. While the electromagnetic
force is carried by the photon,
denoted as γ, the strong nuclear
force was believed to be carried
by particles called pions,
denoted by π± and π⁰. The neutrino
had to be included to account
for nuclear β-decay.


STANDARD MODEL


(c) Anno 2000. The constituent 
and force particles of the Standard 
Model. The quarks and leptons 
form three families of constituent 
particles of which only the top row 
is stable and used to make ordinary 
matter of the sort listed in the pe- 
riodic table. Notice that the Higgs 
particle has a special place in the 
scheme of things. 


Figure 1.4.2: Three levels of ‘simplicity’. Three successive levels of reductionism spanning a century of quantum physics. The basic
building blocks (a) of chemistry, (b) of nuclear physics and (c) of subnuclear particle physics. The atomic nuclei are built from protons
p and neutrons n, which each consist of three quarks, with p = (uud) and n = (udd). So the first element, hydrogen ¹H, for example
has a nucleus consisting of a single proton, while the second element, helium ⁴He, has a nucleus made up of two protons and two
neutrons.


the one who has thus come has gained enlightenment,
which produces knowledge and insight,
and leads to peace, wisdom, enlightenment, and nirvana?

This is the noble eightfold way, namely,
right understanding, right intention,
right speech, right action, right livelihood,
right attention, right concentration,
and right meditation.


Buddha, sermon


The eight ‘rights’ mentioned correspond to the corners of
the octagon that fits in the big wheel, as shown in Figure 1.4.3.


The ‘eightfold way’ à la Gell-Mann is based on a mathematical
group of symmetries known as SU(3).¹ Now, this


¹ SU(3) is the group of ‘rotations’ (special unitary transformations) in three-dimensional
complex space. Indeed, there is one sentence that always applies to quantum
whatever: things become complex! If not in the real sense then at least
in the mathematical sense. Numbers, parameters, functions, spaces,
transformations, all of it turns complex when you go quantum! You need
a tolerance for ‘complexification’ to avoid quantum allergy.
elegant scheme served not only as a meticulous bookkeeping
device: just like Mendeleev’s system, the eightfold
way also made definite predictions for the existence of
certain particle types that were discovered subsequently.
More important, however, was that the SU(3) structure
hinted at the existence of yet another layer of fundamental
particles, from which all known types in the particle
zoo could be assembled. Gell-Mann coined the name
quarks for these new basic building blocks, referring to a
by now famous quote from the novel Finnegans Wake by
James Joyce:


Three quarks for Muster Mark! 
Sure he hasn't got much of a bark 
And sure any he has it’s all beside the mark. 
But O, Wreneagle Almighty, wouldn’t un be a sky of a lark 
To see that old buzzard whooping about for uns shirt in the dark 
And he hunting round for uns speckled trousers around by 
Palmerstown Park? 
Hohohoho, moulty Mark! 
James Joyce, Finnegans Wake 


The accepted pronunciation of this elusive particle’s name is ‘quork’
rather than ‘quark’, although the latter is presumably the one intended
by Joyce, as it rhymes with Mark and bark. Irish friends
I trust have explained to me that the first exclamation
paraphrases a typical order in a pub: ‘Three quarts (of
beer) for Mister Mark!’ In German ‘quark’ refers to a dairy
product, and one would interpret it as: ‘Three quarks for
Master Mark!’ It probably is no accident that Gell-Mann in
his later life turned to the study of linguistics, and in particular
to phonetics, in an attempt to trace back the evolution
of languages and in some sense reconstruct the ‘mother’
of all languages. He always had an exceptional fascination
with and talent for language, as he spoke about twenty of them,
and I remember him always taking extreme care to make
sure he pronounced the rather unpronounceable names
of (in my case, Dutch) colleagues like ‘Gerard ’t Hooft’ or
‘Peter van Nieuwenhuizen’ perfectly, followed by an exegesis
of their meaning and origins!


Figure 1.4.3: The eightfold way. In Buddhism the ‘eightfold way’
refers to a very basic principle that brings the eight primary
teachings together. It was unfolded in Buddha’s first sermon after
his enlightenment. Presumably it was the symmetric geometry
of the above ‘wheel of wisdom’ that suggested the
term to Gell-Mann.


To be or not to be: quarks. According to this scheme 
the quarks carried a new quantum number which is nowa- 
days called flavor. In the original theory there were three 
‘flavors’: up, down and strange, denoted by the letters u,
d and s. Later on additional flavors were discovered — 
charm, top and bottom, denoted by c, t and b — to make 
a total of six. This would mean that the symmetry group 
would be the much larger group SU(6) . The fact is that the 
last three quark types are much heavier particles and very 
unstable, so they do not play a prominent role in ‘ordinary’ 
physics. The physicists say that the SU(6) flavor symmetry 
is ‘broken’ to the much smaller Gell-Mann SU(3) . 


The nucleons (and in fact all baryons) consist of three 
quarks: the proton for example corresponded to (uud) 
and the neutron to (ddu) . The mesons like the pion would 
consist of quark anti-quark pairs. From this assignment it 
is not hard to see that these quarks have to carry fractional 
electric charges: you have two equations for the two charges
$q_u$ and $q_d$, namely $2q_u + q_d = e$ for the proton and
$q_u + 2q_d = 0$ for the neutron, and if you solve them you find
that $q_u = 2e/3$ and $q_d = -e/3$.
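For readers who like to see the arithmetic, the two charge conditions (total charge +e for the proton uud, zero for the neutron ddu) can be solved in a few lines. This is only a minimal sketch working in units of e with exact fractions; the function name `solve_quark_charges` is our own illustrative choice, not something from the physics literature.

```python
from fractions import Fraction

def solve_quark_charges(proton_charge=1, neutron_charge=0):
    """Solve 2*q_u + q_d = Q_p and q_u + 2*q_d = Q_n (charges in units of e)."""
    Qp, Qn = Fraction(proton_charge), Fraction(neutron_charge)
    # Twice the first equation minus the second gives 3*q_u = 2*Qp - Qn.
    q_u = (2 * Qp - Qn) / 3
    # Back-substitute into the first equation to get q_d.
    q_d = Qp - 2 * q_u
    return q_u, q_d

q_u, q_d = solve_quark_charges()
print(q_u, q_d)  # 2/3 -1/3
```

The same two lines of algebra work for any assumed pair of nucleon charges, which makes it easy to see that fractional quark charges are forced by the (uud) and (ddu) assignments.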


Splendid unification: the Standard Model. After these 
new basic building blocks were postulated, it took almost 
another decade before the real theory for the binding of 
quarks into the other nuclear particles was developed and 
the idea of quarks really caught on. What kept the quark 
idea from general acceptance was the question of ‘to be or
not to be’, in the sense that these elusive quarks were not
observed as free particles. With their fractional charges 
they would have been easy to identify. For some reason 
they apparently could not be knocked out of the protons or 
neutrons. They lived in peaceful coexistence with their ‘not 
being there’ so to say. Later we understood that this con- 
finement or imprisonment property of quarks was a con- 
sequence of the nature of the so-called ‘color-force’ be- 
tween them. This new fundamental, strong nuclear force 
between quarks indeed exhibited the desired feature that 
it imprisons the quarks in threesomes (the baryons) or in 
quark-antiquark pairs (the mesons). 


It was not until the 1970s that a slow paradigm shift car- 
ried us to the Standard Model of quarks and leptons and 
of the particles that mediate three of the four known funda- 
mental forces. This Standard Model has in the meantime 
been confirmed in impressive detail by a large number of 
experiments performed at the major particle accelerators 
all over the world. 


Fatal attraction: forces yield structure 


A description of nature does not stop with the inventory 
of building blocks or basic constituents. One also likes to 
know why the building blocks stick together the way they 
do. What we need to know in other words are the forces 
between the constituents, and how they act. Because it is 


Figure 1.4.4: Gravity at work. The solar system with its seven 
(in fact nine) planets moving in bounded elliptic orbits around 
the Sun. (Source: Getty images) 


through interactions between constituents that new structures
emerge. This is an all-important ingredient of building
models of the world at any level, and we will start with a
pedestrian exposé, which we will deepen as the book
proceeds. Attractive forces acting between particles may lead
to the formation of bound states between the constituents 
and thus to the formation of structure. Bound systems are 
only stable when the attractive force is balanced by a repul- 
sive force at small distances. The phenomenon of gravita- 
tional binding in Newtonian physics is most familiar. Here 
we show that our naive classical intuitions fail when talk- 
ing about the atomic binding of electrons to nuclei caused 
by the electromagnetic force, and of the nuclear binding 
of protons and neutrons in the nucleus. We got stuck but 
quantum mechanics came in to rescue us. 


The Earth orbits the Sun in a slightly elliptic orbit: this bind- 
ing is caused by the attractive gravitational force as de- 
scribed in the section about Newtonian mechanics. The 
first question is: why don’t we drop into the Sun as the 
force is attractive all the way in? The force law corresponds
to an infinitely deep gravitational potential well, so
why does the Earth not fall down? Deep wells may provoke
deep thoughts. The reason that the Earth doesn’t
drop in is that it has a tangential velocity, and that velocity
induces an outward-directed, so-called centrifugal force
that balances the gravitational attraction. More precisely,
that tangential velocity component implies that the Earth–Sun
system has a certain non-vanishing angular momentum,
because as you remember L = x × p, and it is
the component of p perpendicular to x that matters. Newton’s
dynamical laws decree that angular momentum is
conserved, basically because the force is directed to the
Sun, i.e. in the radial direction, and therefore that force
cannot change the tangential component of the velocity.²
The expression for the energy of a particle in the gravita- 
tional field can be written as: 


$$E(r) = \frac{p_r^2}{2m} + \frac{L^2}{2mr^2} - \frac{G_N m M}{r}. \tag{1.4.1}$$


The first term contains the radial motion, while the tan- 
gential components give rise to the second term, where 
L is the magnitude of the angular momentum that is a 
fixed number for each orbiting planet. The last term is the 
Newtonian gravitational potential. We note that the second 
term is positive and acts as a repulsive term for decreasing 
r, while the last is attractive. We have depicted them sep- 
arately, as well as their sum in Figure 1.4.5. The resulting 
purple curve has a minimum that corresponds to a situ- 
ation where the radius is fixed and the motion is circular, 
and the velocity entirely tangential ($p_r = 0$).
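To make the balance between the repulsive and attractive terms concrete, here is a small numerical sketch. It takes the last two terms of equation (1.4.1), sets the derivative with respect to r to zero, and checks that the resulting radius is indeed a minimum. The rounded Earth and Sun values and the names `effective_potential` and `circular_radius` are assumptions made purely for illustration.

```python
import math

G = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2
M_SUN = 1.989e30     # mass of the Sun in kg (rounded)
M_EARTH = 5.972e24   # mass of the Earth in kg (rounded)

def effective_potential(r, L, m=M_EARTH, M=M_SUN):
    """Last two terms of (1.4.1): angular momentum barrier plus gravity."""
    return L**2 / (2 * m * r**2) - G * M * m / r

def circular_radius(L, m=M_EARTH, M=M_SUN):
    """Radius where dU/dr = 0, i.e. where L^2/(m r^3) = G M m / r^2."""
    return L**2 / (G * M * m**2)

# Assumed input: a circular orbit at roughly the Earth-Sun distance.
r_orbit = 1.496e11                        # m
v_circ = math.sqrt(G * M_SUN / r_orbit)   # speed that makes the orbit circular
L = M_EARTH * v_circ * r_orbit            # angular momentum of that orbit

r0 = circular_radius(L)
print(f"minimum of U(r) at r0 = {r0:.4e} m")
print(f"U(r0) = {effective_potential(r0, L):.3e} J")
# Moving away from r0 in either direction raises the energy:
print(effective_potential(0.9 * r0, L) > effective_potential(r0, L))
print(effective_potential(1.1 * r0, L) > effective_potential(r0, L))
```

The radius returned by `circular_radius` reproduces the orbit we started from, and the energy at the minimum is negative, as it should be for a bound state.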


Turned the other way around, one may ask what would
happen if we put the Earth at rest at a certain distance
from the Sun and let go. Clearly a disaster would then be inevitable,
as the Earth would drop straight into the Sun.


² There is something far beyond the scope of our present exposé
to worry about, however: if we include Einstein’s relativity the system
would start to radiate gravitationally, which means that the bound system
would lose energy and therefore in the end would collapse anyway.
This effect of energy loss due to radiation has been observed in spectacular
detail in certain double (neutron) star systems.



Figure 1.4.5: Balancing attraction and repulsion. The radial potential
U(r) corresponds to the last two terms of equation (1.4.1)
and represents a central force field which drops off as the in- 
verse square of the radius. The terms are drawn separately as 
well as their sum for a particular choice of the parameters. The 
shape of the potential with a repulsive and an attractive part is 
universal in situations where we have both bound states with 
E < 0, and scattering states with E > 0. The E = 0 case repre- 
sents the parabolic orbit. In the minimal energy (orange dot) the 
radius is fixed and the motion is circular. If the energy is higher, 
the orbit can be elliptical (yellow dashed line) with two turning 
points at different radii. 


With L = 0 there is no angular momentum barrier to save 
the system from collapse! The potential would correspond 
to the blue curve in the figure. This presents us with a
puzzle as a matter of principle if we take the naive
approach and consider the idealized situation where we 
treat the Earth and Sun as point particles. Then the Earth 
while approaching the Sun would feel an ever stronger at- 
tractive force giving the Earth an ever-growing acceleration 
and speed! And by falling in, the Earth would gain an ‘in- 
finite’ amount of energy, and as its speed would be limited 
by the speed of light it would acquire an unlimited amount 
of mass. 


What actually happens in such radial approaches may certainly
be violent, as we know from falling meteorites hitting
us from time to time, but due to the finite size of the objects 
colliding the acceleration towards the center is stopped 
and the kinetic energy is converted into structural damage, 
debris flying around, and heat. 


Infinities call for new physics. What these examples 
teach us is that other, non-gravitational physics takes over 
and saves the day. This is often the cavalier way physi- 
cists wave their hands about the singularities in their theo- 
ries that keep pestering them, and that most non-physicist 
audiences are most curious about. It must be said that 
the physicists have been unreasonably successful with this 
pragmatic approach. As far as we know nature is non-singular,
and the moment it threatens to become singular,
it usually amounts to a wake-up call to go and search for 
new physics and new theories that avoid the singularities 
and thereby save serious science from demise. 


This is exemplified by applying the gravitational force to 
the different case of a radially collapsing star. If we look at 
an extremely massive star, the gravitational attraction is directed
to the center and is kept in balance by the repulsive
force, caused by the outwardly directed pressure gener- 
ated by the nuclear burning processes in its core. How- 
ever, as we'll discuss later on, the amount of nuclear fuel 
is finite and even a massive star will one day stop shin- 
ing, after which a gravitational collapse to some compact 
object is unavoidable. Depending on the mass of the orig- 
inal star, this final state can be a white dwarf, a neutron 
star or a black hole. In the first two cases a new repul- 
sive force working at smaller inter-particle distances halts 
the collapse and allows for a new balance thereby avoiding 
the singularity. The most dramatic possibility is the forma- 
tion of a black hole. But a black hole is surrounded by a 
horizon that keeps us from knowing what happens to the 
mass inside and whether there is anything singular going 
on. A horizon seems to save the day, or better the hori- 
zon masks our ignorance about what precisely is going 
on! Putting things behind the horizon sounds like the scientific
equivalent of sweeping things under the carpet. Yet,
that is apparently the way in which nature prefers to keep
some of its secrets. This property is referred to as cosmic
censorship.


In the previous chapter we mentioned the direction in which 
progress is made to handle this problem. It is again by 
shifting the attention from the singularity in the origin to a 
deeper quantum mechanical understanding of what a hori- 
zon really is. In principle black holes come in all sizes and 
a Planck-mass black hole would have a horizon as well, 
and could therefore be considered as the ‘hydrogen atom’ 
of quantum gravity. We just don’t know yet how this works 
precisely, as we have no fully consistent quantum theory 
of the gravitational force. But taking the essential idea of 
Hawking radiation from the horizon as a guiding principle, 
black holes would be unstable states of matter, bound to 
somehow evaporate completely. And that would turn the 
embarrassment of its singularity into some kind of red herring.
For the moment, however, black holes remain in the
category of ‘unsolved problems’. 


The quantum stability of matter. In the case of colliding 
ordinary objects it is the much stronger electric force that 
keeps the balance, and prohibits the infinite energy gain of 
two point particles colliding gravitationally. But what if we 
have two point particles with opposite charges, say a posi- 
tively charged proton and a negatively charged electron, 
which make up the familiar hydrogen atom? Now both 
forces are attractive, and yes there can again be an an- 
gular momentum barrier, or better a repulsive core due to 
the angular momentum that dominates over the attraction 
for small distances. But what about the lowest state, where
the angular momentum would be zero?


Classically the same 1/r singularity, as it is called, would
certainly rear its head again, and maybe you would expect
a mini black hole to form. No, this is certainly not what
happens, and yes, there is other physics (quantum physics,
to be precise) that saves the day. The lowest quantum
state with zero angular momentum turns out to be perfectly
stable and well behaved. It has a wavefunction that corre- 
sponds to a spherically symmetric probability distribution 
for the electron to be at a finite distance from the nuclear 
core. It is one of those ‘life saving’ manifestations of the 
Heisenberg uncertainty relation. This relation does not al- 
low a quantum particle to just sit at the bottom of a ‘quan- 
tum bowl’; being at rest and completely localized is a no- 
go. Heisenberg prohibits a particle from falling down to the 
origin. This result is all-important because what it means is 
that quantum theory guarantees atomic stability. Stability 
means that the energy of a quantum system is somehow 
bounded from below. The atom can radiate away electro- 
magnetic energy by emitting photons until it gets down to a 
lowest angular momentum and lowest energy state which 
is perfectly regular and stable. 
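The way the uncertainty relation puts a floor under the energy can be made tangible with a standard back-of-the-envelope estimate, which is not spelled out in the text above and should be read as a heuristic only. Assume that an electron confined within a distance r of the proton has a momentum of order ħ/r. Its energy is then roughly ħ²/(2mr²) minus the Coulomb energy, which has the same shape as the curve of Figure 1.4.5, with ħ playing the role of L. The sketch below finds the minimum; the function name `estimated_energy` is ours.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m

def estimated_energy(r):
    """Heuristic energy (J) of an electron confined within a distance r of the proton."""
    kinetic = HBAR**2 / (2 * M_E * r**2)                  # from p ~ hbar / r
    potential = -E_CHARGE**2 / (4 * math.pi * EPS0 * r)   # Coulomb attraction
    return kinetic + potential

# Setting dE/dr = 0 gives the radius at which the estimate is lowest.
r_min = 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)
e_min_ev = estimated_energy(r_min) / E_CHARGE

print(f"r_min = {r_min:.3e} m")
print(f"E_min = {e_min_ev:.2f} eV")
```

Squeezing the electron closer to the nucleus than `r_min` costs more kinetic energy than the Coulomb attraction gives back, so the energy is bounded from below. That this crude estimate lands on the Bohr radius and on about −13.6 eV is a fortunate coincidence of the chosen prefactors; only the orders of magnitude should be taken seriously.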


Having made this victorious claim I should sit back for a
moment and scratch my head. What about an atom with 
more than one electron? Just take any. Would this atom 
not decay into a state where all electrons descend to their 
lowest possible, so-called, ground state, one may ask? 
Certainly if we ignore the electric forces between elec- 
trons. But is this what we see happening? 


The answer to this well-posed question is a fully-fledged 
‘No’! We see that different atoms behave quite differently 
from a chemical point of view, and that fact is at the root 
of all diversity in nature. How could that ever be if all elec- 
trons would be sitting in the same state? This disturbing 
shortcoming of naive quantum theory is resolved by an ad- 
ditional — at first sight magical — quantessential principle, 
that prohibits particles like electrons from occupying the same
state! When Wolfgang Pauli introduced this exclusion prin- 
ciple it was certainly a rather ad hoc rule, a veritable deus 
ex machina. But it did in one blow bring theory back into 
excellent agreement with the observations. According to 
this principle you should think of electrons a bit like people 
at a pop festival in desperate need of a toilet. The simple 
truth is that a ‘seat’ is either free or occupied and there is 


no in-between; if occupied, you have to go and look for the 
nearest free seat, which may be way out. Electrons are 
permanently involved in playing some game like ‘musical
chairs’. A notable aspect of this mutual exclusion is that it
only concerns exclusion of the same type of particles, not 
particles of a different type. Moreover not all particle types 
are subject to the exclusion principle. The particles which 
are like electrons are called fermions, while the particles 
that are not, like the photon, are called bosons. We will 
return to this topic in a forthcoming section. First we turn 
to a more detailed description of the atom. 


Atomic structure 


One of the early icons of quantum theory is the Bohr model 
of the atom that we discussed in the previous chapter. It 
makes it clear in a transparent way how a rather simple 
but radical idea that can be directly implemented leads to 
a very non-classical behavior, explaining qualitatively the 
physics we are observing. This heuristic device was then 
turned into a mathematically precise framework by Heisenberg,
Schrödinger, Dirac, Born and many others. This
work revealed a complete set of quantum numbers label- 
ing the states including the spin of the electrons. To com- 
plete the model of the atom Pauli’s exclusion principle also 
had to be invoked. The study of the atom taught us what
quantization really means, and at the same time raised the 
intricate epistemological questions that haunted the theory 
and its practitioners for almost a century thereafter. 


The Bohr atom: energy quantization 


In the subsection on the Bohr radius on page 134 of
the previous chapter, we introduced the Bohr model for
the atom with its characteristic quantized orbits depicted in
Figure 1.3.7, and its quantized energy levels. In this section
we want to look at these quantized energy levels and
point out how they are related to the observed discrete
line spectra of light emitted by atoms. The connection that
Bohr established was that if an atom makes a transition
from some excited state to a lower one, it emits a
photon with a frequency given by the Planck–Einstein relation,
so ΔE = hν. Conversely, an atom can absorb a
photon if its energy matches the energy needed for an electron
to move up. The schematic of such processes is given in
Figure 1.4.6, where it is also indicated that for the hydrogen
atom the transitions to the ground state (n = 1) have frequencies
that correspond to the ultraviolet, while the transitions to
n = 3 correspond to the infrared end of the spectrum. So
only the transitions to the n = 2 levels are in the visible
domain. Clearly, having a simple model that could account
for these discrete line spectra was a major success for the
early quantum physicists. These line spectra can be considered
as an atomic barcode: if you hand one to me, I can
tell you which atom you were looking at.
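The ‘barcode’ of hydrogen can be reproduced with very little effort. The sketch below assumes the Bohr energy levels E = −13.6 eV/n² (ignoring small corrections such as the finite proton mass) and uses ΔE = hν in the equivalent form λ = hc/ΔE. The rounded constants, the 380 to 750 nm limits for visible light, and the function names are assumptions chosen for illustration.

```python
RYDBERG_EV = 13.6057   # hydrogen ground-state binding energy in eV (rounded)
HC_EV_NM = 1239.842    # Planck's constant times the speed of light, in eV nm

def level_energy(n):
    """Bohr energy of level n in eV."""
    return -RYDBERG_EV / n**2

def emitted_wavelength_nm(n_upper, n_lower):
    """Wavelength of the photon emitted in the transition n_upper -> n_lower."""
    delta_e = level_energy(n_upper) - level_energy(n_lower)  # positive, in eV
    return HC_EV_NM / delta_e

def band(wavelength_nm):
    """Rough classification; the 380-750 nm limits for visible light are approximate."""
    if wavelength_nm < 380:
        return "ultraviolet"
    if wavelength_nm <= 750:
        return "visible"
    return "infrared"

# First three lines of the series ending on n = 1, 2 and 3.
for n_lower in (1, 2, 3):
    for n_upper in range(n_lower + 1, n_lower + 4):
        lam = emitted_wavelength_nm(n_upper, n_lower)
        print(f"{n_upper} -> {n_lower}: {lam:7.1f} nm  ({band(lam)})")
```

The first lines ending on n = 1 come out in the ultraviolet, those ending on n = 2 in the visible, and those ending on n = 3 in the infrared, in line with Figure 1.4.6. Lines from very high levels down to n = 2 creep just below the violet edge, so ‘visible’ applies to the first several lines of that series.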


The Schrödinger atom: three numbers


After the Bohr model was introduced in 1913, it would
take another thirteen years until Schrödinger and Heisenberg
published their fundamental equations for quantum
physics. The first called the theory wave mechanics and
the second matrix mechanics, but in fact they were fully
equivalent descriptions of the quantum states and their
observables, as was later shown by Dirac. The Schrödinger
equation is a wave equation in three dimensions that
could be solved exactly for simple atoms and that yielded
the full spectrum of atomic states with all its quantum numbers.
It went much further than the Bohr model, but to a
certain extent it incorporated the same simple idea in a full
three-dimensional model for the atom. In the Schrödinger
picture the states correspond to wavefunctions ψ(x) that
are defined over all of position space, x ∈ 𝒳 = ℝ³.
And from the wavefunction of a state the related proba- 


Figure 1.4.6: The origin of light. If the electron makes a transition
between the energy levels, the fixed energy difference ΔE
translates into the photon frequency: ΔE = hν. This determines
the color of the lines of the spectrum, which can be observed in 
absorption (left) or in emission (right). 


bility distribution of where to find the electron can be de- 
rived. 


The equation: a guided tour.
So, let me step back and try to give you an idea what the
Schrödinger equation is about, and what it looks like. Let
us call it a ‘guided tour’. In an operational, maybe even
opportunistic, sense, going from classical to quantum mechanics
is mathematically speaking not that hard. Once
you accept that momentum is represented as a spatial
derivative operator, p = −iħ∇, and the energy or Hamiltonian
as a time derivative, H = iħ ∂/∂t, one can translate
the classical functions into corresponding quantum operators
or equations just by substitution. For example:

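$$E = \frac{p^2}{2m} + V(\mathbf{x}) \quad\longrightarrow\quad i\hbar\frac{\partial}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x}), \tag{1.4.2}$$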


on the left we have the Newtonian expression of the energy
and on the right we have the Schrödinger ‘wave operator’,
which when we let it work on a (wave) function Ψ(x, t)
yields the Schrödinger equation in all its glory:


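$$i\hbar\,\frac{\partial \Psi(\mathbf{x},t)}{\partial t} = \left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x})\right)\Psi(\mathbf{x},t). \tag{1.4.3}$$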


For now we don’t want to get too deep into the mathemat- 
ics of the equation but let us at least make some observa- 
tions which are not so hard to digest: 

(i) The equation expresses a simple truth, namely that the 
energy (operator) generates the time evolution of the sys- 
tem. 

(ii) Quantum states are described by wavefunctions Ψ that
satisfy this equation. 

(iii) The wavefunction is complex meaning that it has a 
phase factor in it, and it describes a probability amplitude. 
(iv) Taking the absolute square of the amplitude gives the probability density
ρ(x, t) for finding the electron in a small volume element
d³x around the position x and at a time t. We define
ρ(x, t) = |Ψ|², so that the overall phase of the amplitude
drops out. It doesn’t affect the probability, which is where
the physics is.

(v) Indeed, the notion of probability apparently enters al- 
ready on this basic level in the theory, where we are still 
talking about the state of a single particle. 


The quantization. Of great importance are the so-called
stationary states, meaning that the physical properties do
not change in time. You would think that the wavefunction
has to be time independent in that case, but that is not quite
true. What is true is that the time dependence has to sit
in a phase factor $e^{i\varphi(t)}$, which is going to drop out anyway
in the probability density. The answer is to write Ψ as a
product of a phase factor which depends on t only, and a
time independent wavefunction ψ(x) that describes a time
independent stationary state. We write

$$\Psi(\mathbf{x},t) = e^{i\varphi(t)}\,\psi(\mathbf{x}) = e^{-iEt/\hbar}\,\psi(\mathbf{x}), \tag{1.4.4}$$

and substitute it in the Schrödinger equation. If you take
the derivative, you get that the time dependence drops out

completely and you are left with a nice time independent
equation for ψ(x):

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x})\right)\psi(\mathbf{x}) = E\,\psi(\mathbf{x}), \tag{1.4.5}$$

where E is the constant energy value of the stationary state
ψ. The crucial point here is that you first have to solve
the equation to find out which values of E make quantum
sense. It turns out that only specific values give a solution
for which the absolute square of ψ gives an acceptable probability
function. This means that the solutions have to be
square integrable; the integral over all of space of the absolute
square of the function has to be finite. This integral
can then be normalized to one to obtain an appropriate
probability density. This type of mathematical problem
is called an eigenvalue problem; the values E that occur
in equation (1.4.5) are called eigenvalues and the corresponding
functions ψ(x) are called eigenfunctions. This
really is the stage at which the quantization ‘takes place’ in
the Schrödinger approach, and the eigenvalues are often
called quantum numbers. Hopefully this helps you to also
imagine what people mean when they talk about ‘quantizing’
some (classical) system. They perform the substitutions
as we did in equation (1.4.2) and then look for the
eigenvalues and the corresponding eigenfunctions characterizing
the quantum states of the system.
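For those who want to see the step ‘take the derivative’ written out, here is a short derivation. It uses nothing beyond equations (1.4.3) and (1.4.4), with the phase taken to be −Et/ħ; the only thing we rely on is that the operator on the right-hand side does not act on t.

```latex
\begin{align*}
i\hbar\,\frac{\partial}{\partial t}\Bigl(e^{-iEt/\hbar}\,\psi(\mathbf{x})\Bigr)
  &= i\hbar\left(-\frac{iE}{\hbar}\right) e^{-iEt/\hbar}\,\psi(\mathbf{x})
   = E\,e^{-iEt/\hbar}\,\psi(\mathbf{x}), \\
\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x})\right) e^{-iEt/\hbar}\,\psi(\mathbf{x})
  &= e^{-iEt/\hbar}\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x})\right)\psi(\mathbf{x}).
\end{align*}
```

Equating the two right-hand sides and dividing by the common phase factor leaves exactly the time independent equation (1.4.5). Since the phase factor has absolute value one, the probability density $|\Psi|^2 = |\psi|^2$ does not depend on time, which is what ‘stationary’ means.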


A free quantum particle. Let us consider the simple case
where V(x) = 0, which corresponds to a free particle. The
solutions are periodic plane waves:

$$\psi_{\mathbf{k}}(\mathbf{x}) \propto e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{1.4.6}$$

The meaning of the vector k (which appears here as a
vector of free parameters defining the solution) becomes
clear if we substitute the solution in equation (1.4.5) with
V = 0, which yields:

$$E_{\mathbf{k}} = \frac{\hbar^2 |\mathbf{k}|^2}{2m}. \tag{1.4.7}$$


This is just the classical expression for the kinetic energy 
once we use the fact that the momentum is given by p = 
Particle in box: 0 < x₁, x₂ < L; n₁ = 3, n₂ = 4


Figure 1.4.7: Quantum particle in a box. A state of a two-dimensional
quantum particle in a box of length L. The wavefunctions
ψ(x) have to vanish on the boundary, and are of the form
$\psi_{n_1 n_2} \sim \sin(n_1 \pi x_1/L)\,\sin(n_2 \pi x_2/L)$.
We have plotted the wavefunction ψ and the corresponding probability
density ρ for finding the particle, corresponding to quantum
numbers n₁ = 3 and n₂ = 4.


ħk. There is an annoying technical complication here: if
you calculate the probability density for the particle, you
find ρ(x) = ψ*ψ = 1, which is unacceptable because it
cannot be normalized to ‘1’. If you take the integral
over all of space of a constant non-zero probability density, you would
find the total probability to be infinite! The way out is to
put the particle in a box, say a cube of size L, so that
the wavefunctions have to vanish on the boundary, where
$x_i = 0$ or $x_i = L$. This in turn means that the momentum values become
quantized: $p_i = \hbar k_i = \pi\hbar n_i/L$ with integer-valued
$n_i$. The space of admissible momenta corresponds therefore
to an infinite three-dimensional cubic lattice, where the
energy levels grow as the length of the momentum vector
squared: $E_{\mathbf{n}} \sim |\mathbf{n}|^2$.


In Figure 1.4.7 we have depicted a particular solution for a 
two-dimensional particle in a box, where the wavefunctions 


Figure 1.4.8: Spherical harmonics. The spherical coordinates
r, θ, and φ of the (yellow) point x are defined on the left. On the
right the angular distribution $\rho_{(l,m)}(\theta,\varphi) = |Y_l^m|^2$ for a state
with quantum numbers l = 3, m = 1 is plotted.


that satisfy the boundary conditions are of the form: 


$$\psi_{n_1 n_2}(\mathbf{x}) = N \sin(n_1 \pi x_1/L)\,\sin(n_2 \pi x_2/L), \tag{1.4.8}$$


with N a normalization factor. In the figure we plotted
the wavefunction and the corresponding probability density
function for the case n₁ = 3, n₂ = 4. Note that this
wavefunction describes a one-particle state, but that the
particle has a rather pronounced preference for certain positions,
which sit on a periodic lattice inside the box. We
will in Volume II go much deeper into what this probability 
interpretation of the wavefunction exactly means. For ex- 
ample, looking at the figure the obvious question: ‘Where 
is the particle?’ begs for an answer. As it turns out that 
answer is far from obvious! 
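The two-dimensional box state of equation (1.4.8) is simple enough to check by hand or by machine. The sketch below assumes a box of side L = 1 and the normalization factor N = 2/L, which is what a finite, unit total probability requires for this product of sines; it then verifies numerically that the probability density integrates to one and that the wavefunction vanishes on the walls. The function names and the grid size are illustrative choices.

```python
import math

def psi(x1, x2, n1, n2, L=1.0):
    """Wavefunction of equation (1.4.8) with N = 2/L (assumed normalization)."""
    N = 2.0 / L
    return N * math.sin(n1 * math.pi * x1 / L) * math.sin(n2 * math.pi * x2 / L)

def total_probability(n1, n2, L=1.0, steps=200):
    """Midpoint-rule integral of |psi|^2 over the whole box."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        for j in range(steps):
            x1 = (i + 0.5) * h
            x2 = (j + 0.5) * h
            total += psi(x1, x2, n1, n2, L) ** 2 * h * h
    return total

def energy_units(n1, n2):
    """Energy in units of pi^2 hbar^2 / (2 m L^2), from p_i = pi hbar n_i / L."""
    return n1**2 + n2**2

print(f"total probability: {total_probability(3, 4):.6f}")
print(f"psi on the wall x1 = L: {psi(1.0, 0.3, 3, 4):.1e}")
print(f"energy of (3, 4) in box units: {energy_units(3, 4)}")
```

For n₁ = 3 and n₂ = 4 the density |ψ|² has 3 × 4 = 12 equally high peaks, located where both sines reach ±1. This is the ‘periodic lattice’ of preferred positions visible in Figure 1.4.7.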


The hydrogen atom. Let us return to the question of what 
the states look like for an atom. With the nucleus in the 
origin the electron moves in the spherical Coulomb field 
caused by the positive charge of the nucleus. The poten- 
tial has a rotational symmetry, which means that it is ad- 
Spherical harmonics $Y_l^m(\theta, \varphi)$


Figure 1.4.9: What the quantum states of hydrogen look like.
The angular dependencies of the hydrogen wavefunctions correspond
to the so-called spherical harmonics $Y_l^m(\theta, \varphi)$,
where n, l and m are the discrete quantum numbers which label
the state. Each state can hold at most two electrons, one
with spin up and one with spin down. The first quantum
number n labels the energy level, corresponding to a triangle
in the figure. At each level we have states where the angular momentum
l runs from 0 to n − 1 along the vertical axis, and for
each l the component m runs horizontally from −l to l.


vantageous to rewrite the Schrédinger equation in terms of 
a radial (r) coordinate and two angular coordinates (09, ~) 
(see Figure 1.4.8). The equation then basically separates 
into three independent equations depending again on cer- 
tain discrete quantum numbers. The radial quantum num- 
ber n = 1,2,... linked to energy level is basically the or- 
bital quantum number introduced by Bohr. And the en- 
ergy eigenvalues E we just discussed are quantized like 
E ~ 1/n?. The angular dependence of the states intro- 
duces two more quantum numbers: l = 0,...,m — 1 and 
—l < m < +l, both of which are related to angular mo- 
mentum. The wavefunctions corresponding to the states 
are usually written like Wnim(T, 89, @) = Rni(t) Yim(9, ©) 
where the radial and the angular dependences are sep- 
arated. In Figure 1.4.9 we have depicted the angular de- 


Figure 1.4.10: Charge distributions. Light color indicates high 
probability. The charge or electron probability distribution in the 
xz—plane shows the © and r dependence. Depicted are the 
n = 4,1 = 3 states, with m = 0,...,3. These states corre- 
spond to the states on the bottom line of the previous figure. 
The shapes of the probability distributions are all-important for 
understanding the chemical binding properties. 


pendencies by plotting (the real part) of the functions Yı ™ 
for all admissable l and m values up to principle quantum 
number n = 4. 


Degeneracies. It turns out that the states $\psi_{nlm}(\mathbf{x})$ are
highly degenerate, meaning that different states have
the same energy. For every energy level (labeled by the
quantum number n) there are a total of $2n^2$ different
states which all have the same energy. In Figure 1.4.9 these
degenerate states correspond to the angular (l, m)-states
within each triangle labeled by n. The extra factor of two
comes from the two possible electron spin states that will
be discussed shortly. Plotting this discrete spectrum one
would get a three-dimensional discrete lattice filling a
triangular pyramid (or is it a nicely decorated Christmas
tree?). These degeneracies are not accidental: they are the
consequence of certain symmetries in this problem. These
symmetries lead to certain conserved quantities and these in
turn lead to degeneracies in the energy spectrum. We will
return in more detail to this topic in Chapter II.6.
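The count $2n^2$ can be checked directly from the quantum numbers given above. A short derivation, assuming only the ranges l = 0, ..., n − 1 and −l ≤ m ≤ +l (so 2l + 1 values of m for each l) together with the two spin states:

```latex
\sum_{l=0}^{n-1} 2\,(2l+1)
  \;=\; 2\left[\, 2\cdot\frac{(n-1)\,n}{2} + n \right]
  \;=\; 2n^2 .
```

For n = 1, 2, 3, 4 this gives 2, 8, 18 and 32 states respectively.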


Lifting degeneracies. These degeneracies corresponded 
exactly to the observations that the Dutch physicist Pieter 
Zeeman made some 25 years earlier (almost simultane- 
ous with Planck’s quantization hypothesis). He discovered 
that by turning on a magnetic field the degeneracy of the 
different angular momentum eigenstates was lifted, which 
is reflected in the splitting of the spectral lines correspond- 
ing to one energy level into many different lines. So you 
could count the multiplicity of the degeneracies. For the 
discovery of this ‘Zeeman splitting’ he shared the Nobel 
prize with Hendrik Antoon Lorentz in 1902. 


With the Bohr model in mind it is intuitively not too hard 
to interpret these splittings. Clearly Bohr had only used 
circular orbits but if we think of negatively charged elec- 
trons orbiting the positively charged nucleus, these would 
create a magnetic moment like a circular electric current 
would do. This magnetic moment would be proportional to 
the angular momentum of the electron state. What caused 
the Zeeman effect was that the different magnetic moment 
or angular momentum states would acquire an extra en- 
ergy contribution from the interaction of that moment with 
the external magnetic field. Interpreted this way, his
measurements showed direct evidence for the quantization
of the component of the angular momentum along the
magnetic field, which takes the quantized values
mħ, where for given l there is naturally the restriction
−l ≤ m ≤ +l. Needless to say, none of these quantization
rules can be understood from a classical point of view.
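The splitting pattern described here can be sketched in a few lines. The example assumes the standard expression for the normal Zeeman shift, ΔE = m μ_B B with μ_B the Bohr magneton; that formula and the names used below are not spelled out in the text and are introduced only for illustration.

```python
# Bohr magneton in eV per tesla (standard value, rounded).
MU_B_EV_PER_TESLA = 5.788e-5


def zeeman_shifts(l, b_tesla):
    """Return {m: energy shift in eV} for m = -l, ..., +l.

    Assumes the normal Zeeman effect (orbital magnetic moment only).
    """
    return {m: m * MU_B_EV_PER_TESLA * b_tesla for m in range(-l, l + 1)}


if __name__ == "__main__":
    for m, shift in zeeman_shifts(l=2, b_tesla=1.0).items():
        print(f"m = {m:+d}: shift = {shift:+.3e} eV")
```

A level with angular momentum l thus fans out into 2l + 1 equally spaced sublevels, which is how the multiplicity of the degeneracy can be counted from the spectrum.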


This splitting, which could be completely accounted for 
within the framework of the Schrödinger or Heisenberg 
equation, is called the normal Zeeman effect. However, 
Zeeman did actually discover an additional quantessential 


feature in the spectra, which is referred to as the anoma- 
lous Zeeman effect, to which we turn next. 


The discovery of spin 


The Pauli principle was published early in 1925.
... Well, I had introduced those quantum numbers
but, if I had been a good physicist, then I would
have noticed already in May 1925 that this implied
that the electron possessed spin. But I was not a
good physicist and thus I did not realize this... Then
Uhlenbeck appears on the scene ... he asked all
those questions I had never asked ... When the
day came that I had to tell Uhlenbeck about the
Pauli principle (of course using my own quantum
numbers) then he said to me: ‘But don’t you see
what this implies? It means that there is a fourth
degree of freedom for the electron. It means that
the electron has a spin, that it rotates’... I asked
him: ‘What is a degree of freedom?’ In any case,
when he made his remark, it was luck that I knew
all these things about the spectra, and I said: ‘That
fits precisely in our hydrogen scheme which we
wrote about four weeks ago. If one now allows the
electron to be magnetic with the appropriate magnetic
moment, then one can understand all those
complicated Zeeman effects.’

Samuel Goudsmit (1971) 


As announced, there was another quantessential treasure
hidden in Zeeman’s spectral data that caused a great deal
of confusion among the early quantum physicists. It is
known as the anomalous Zeeman effect, and was observed
in the spectrum of sodium, where a line already appeared
split in the absence of an external magnetic field: this is
because of the coupling between the spin and orbital
magnetic moments. When Zeeman turned on the field, he
found further splittings into an even number of lines, as
indicated in Figure 1.4.11. These splittings implied that
quantum theory would somehow also admit half-integral values
for the angular momentum (as 2j + 1 is only even if j is
half-integral).


Figure 1.4.11: The discovery of spin (and the qubit). This
shows the anomalous Zeeman effect discovered in 1898, the
same year that Planck introduced his constant. The line
spectrum of sodium, corresponding to the arrows in the figure, was
split due to the spin 1/2 of the electron. Without the magnetic
field, the level l split up into levels with total angular
momentum j = l ± 1/2 (on the left). Turning on a weak magnetic
field he observed the hyperfine structure because the spin
degeneracy was lifted.


In 1925 the resolution was proposed by two young Dutch- 
men, Samuel Goudsmit and George Uhlenbeck who were 
still graduate students at Leiden University. They came 
up with the bold proposal that the electron was spinning 
around its axis and that that ‘spin’ would account for the 
so-called hyperfine splittings observed in the anomalous 
Zeeman effect. It may remind you of the good old solar 
system with the Earth rotating about its axis while orbiting 
the Sun! The idea implied that the observations had noth- 
ing to do with an extra feature of the atom as a whole, but 
rather with a totally new feature of the electron itself. 


Behind the scenes. Wolfgang Pauli 
had already come across this prob- 
lem in 1925 and had understood that 
the quantum numbers of atomic states 
were basically related to the radial and 
the angular motions of the electron. Indeed, three 
dimensions gave rise to three quantum numbers: 
n = 1, 2, ... for the radial direction, and
l = 0, 1, ..., n − 1 and −l ≤ m ≤ +l for the angular
motions. But he also noted that to get things right he
needed a fourth quantum number which he somewhat
desperately called Zweideutigkeit, meaning
something like ‘double valuedness’. The story of
how the all-important discovery of spin unfolded is
kind of amusing, but for the young researchers
involved in fact rather traumatic.

Goudsmit and Uhlenbeck discussed their spin idea
with their Leiden advisor Paul Ehrenfest, who liked it
and proposed that they should write it up. They did
so and showed their work to the grand old Leiden
professor Lorentz, who had earlier developed a
sophisticated theory of the electron, but entirely within
the classical framework. A thing he could do well
was to calculate the rotational speed the electron
would need to have in order to produce the magnetic
moment corresponding to (1/2)ħ, and that
turned out to exceed the speed of light by orders
of magnitude. This is in clear contradiction with the
theory of relativity. Understandably, this argument
knocked down the confidence of the students
and they went back to Ehrenfest to humbly withdraw
their paper that contained this incredible stupidity.
Alas, it turned out that Ehrenfest had already
submitted the paper, and didn’t seem to take it too
seriously, making the consoling remark: ‘Sie beiden
sind jung genug sich eine Dummheit leisten zu können.’
(‘The two of you are still young enough that
you can afford such a stupidity’). Actually it
seems that Bohr, when he heard about the proposal,
liked it, and Einstein also apparently judged it rather
mildly.

It actually turned out that somewhat before that
time, a young American physicist, Ralph de Laer
Kronig, had also thought of the electron spin (amusingly,
I took my first quantum mechanics course with
Kronig in Delft in 1966). To his misfortune, he happened
to show the idea to Wolfgang Pauli, who instantly
demolished it, so that Kronig ended up not
working on it any further. The issue of who should
and who should not be credited with the
discovery/invention of spin remains hidden in darkness.
It is this strange story with a touch of tragedy that
may explain why a Nobel prize was never awarded
for the discovery of the quantessential property of
spin as an intrinsic property of particles. It also
shows that the advice of even the greatest ‘advisors’
should sometimes be taken with the necessary
pinch of salt.


The electron possessed a new property called spin! It
could only exist in a spinning state with intrinsic angular
momentum value s = 1/2 in units of ħ. In Figure 1.4.11
we show how this conjecture did in a rather spectacular
way resolve the special properties of those particular
‘D-lines’ in the spectrum of sodium. The idea was to think
of a new total angular momentum quantum number denoted
j = l ± 1/2, basically expressing that the spin would
either be aligned or anti-aligned with the orbital angular
momentum. In that case the component of j along the field,
denoted as $j_3$, could take the values $-j \le j_3 \le +j$.
Hence the number 2j + 1 of energy levels on the right-hand
side of Figure 1.4.11 is even, since j = 1/2 and 3/2
respectively. And that does the job if you assume in addition
that the transitions can only take place if they obey the rule
$\Delta j_3 = -1, 0, 1$, which follows naturally if you take into
account that the outgoing photon itself has spin one.
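The counting in this argument can be made explicit with a small sketch. It lists the values j = l ± 1/2, the 2j + 1 components $j_3$, and the pairs allowed by the rule $\Delta j_3 = -1, 0, 1$. The function names are illustrative, and taking the lower level of the sodium D-lines to be l = 0 with j = 1/2 is the usual textbook assignment rather than something stated in the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def j_values(l):
    """Total angular momentum values j = l - 1/2, l + 1/2 (only 1/2 for l = 0)."""
    return [j for j in (l - HALF, l + HALF) if j > 0]


def j3_values(j):
    """Allowed components j3 = -j, -j + 1, ..., +j (2j + 1 values)."""
    count = int(2 * j) + 1
    return [-j + k for k in range(count)]


def allowed_transitions(j_upper, j_lower):
    """Pairs (j3_upper, j3_lower) obeying delta j3 in {-1, 0, 1}."""
    return [
        (a, b)
        for a in j3_values(j_upper)
        for b in j3_values(j_lower)
        if abs(a - b) <= 1
    ]


if __name__ == "__main__":
    lower = HALF  # assumed lower level: l = 0, j = 1/2
    for j in j_values(1):  # upper level with l = 1
        n_levels = len(j3_values(j))
        n_lines = len(allowed_transitions(j, lower))
        print(f"j = {j}: {n_levels} sublevels, {n_lines} allowed transitions")
```

The sketch only counts allowed transitions; whether they show up as distinct spectral lines depends on the size of the level shifts, which is not computed here.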


Let us conclude with a comment on the splittings of the
energy levels. If we had refined our model to include
the interaction of the magnetic electron spin degree
of freedom with the magnetic moment due to the orbital
motion of the electron, the so-called spin-orbit coupling, we
would have found the fine splitting of the left column in
Figure 1.4.11. Furthermore, if we had included the
interaction of the electron spin with the nuclear magnetic
moment, we would have found the hyperfine splittings on
the right of the figure.


Fermions and bosons 


There are many macroscopic phenomena that can only be 
understood from underlying quantum principles of matter. 
One of the quantum principles which has a tremendous ex- 
planatory power is Wolfgang Pauli’s exclusion principle: it 
decrees that two electrons cannot occupy the same quan- 
tum state. This exclusion property is instrumental, for ex- 
ample for understanding the atomic structure of the ele- 
ments and the magnificent chemical diversity that derives 
from it. Not all particles obey the principle though: the par- 
ticles that have half-integral spin do obey and are called 
fermions, while the particles that have integral spin do not 
and are called bosons. 


Having made a strong plea for the microscopic domain
as the realm where the laws of quantum theory are
indispensable, I should hasten to correct myself. This is
a severe understatement. Quantum theory manifests itself
on all scales, but could only be discovered on the
microscopic level where it is omnipresent, manifest and
inescapable. Once that is recognized, however, it turns out
that there is a host of macroscopic phenomena that cannot
be explained without a deep understanding of quantum
theory. This is so because macroscopic systems are
made up of large numbers of microscopic quantum particles.
One might expect that there are particular properties
of the microscopic constituents which are specifically
quantum mechanical and have a strong bearing on
interactions between the particles, and therefore also on their
collective behavior. Consequently there are many
macroscopic phenomena which are not obviously quantum, but
nevertheless can only be understood if one takes the
underlying quantum physics into account.


Figure 1.4.12: The exclusion principle. The exclusion principle
applied in the game called musical chairs. (Source: wikiHow)


Going from the microscopic to the macroscopic domain
does not necessarily erase all quantum traces. A striking
example is the property of spin and the exclusion principle
of Pauli that, as we mentioned, decrees on the quantum
level that particles with half-integral spin cannot occupy
the same quantum state. We will have much more to say
about this in Chapter II.5 of Volume II, but for the moment
we will state the basic facts about it. Whereas the photon
is a boson, the electron is a fermion and so are the proton
and neutron. So, fermions don’t like each other: they
like to claim territory and chase away intruders, and they
not only try but have to avoid each other. In spite of having
no genes they certainly come across as rather selfish!
Fermions are permanently involved in playing a kind of
musical chairs (see Figure 1.4.12). For bosons the behavior is
the opposite: if the system is at very low temperature and
there is no energy to excite the bosons, they love to join
each other, and all settle in the same ground state. They
will form what is called a condensate, a Bose-condensate
to be specific. These are macroscopically coherent collective
quantum states which may exhibit spectacular properties.
This form of quantum coherence manifests itself in,
for example, a laser beam, but also in phenomena like
superfluidity and superconductivity. We will return to these
properties in Chapter III.3 of Volume III.


Figure 1.4.13: The struggle to unravel structure. Mendeleev’s
periodic table of chemical elements in a historical perspective.
The dark red color indicates the elements which were known
already in antiquity. Adding the light pink entries you arrive at
Mendeleev’s table. Including the blue colored elements brings
us up to 1945 (Seaborg’s table) and the yellow elements were
discovered after that. Many entries are thus post-Mendelevian.
(Source: Sandbi - Wikimedia Commons)


Figure 1.4.14: The left step Janet periodic table. This is a logical
representation of the periodic table in direct correspondence with the
quantum classification of atomic states, as depicted in Figure 1.4.9.
Starting from the right, each block corresponds to an increasing
integer value of the angular momentum as is indicated in the left
column, where letters are used with l = 0 ↔ s, 1 ↔ p, 2 ↔ d, 3 ↔ f.
Each block contains 2(2l + 1) states, i.e. 2, 6, 10, 14, .... In
comparison with the standard Mendeleev representation of
Figure 1.4.2(a), one sees that the extra rows added (of the
Lanthanides and the Actinides) at the bottom of the Mendeleev table
get a natural place as l = 3 blocks in the Janet table.


Atoms: the building blocks of chemistry 


The connection between Mendeleev’s periodic table of the 
chemical elements depicted in Figure 1.4.13, and the sys- 
tematics of atomic states of Figure 1.4.9 is not immediate. 
This is so because the Mendeleev table was conceived 
prior to quantum mechanics. This begs for alternative rep- 
resentations of the periodic table in which the underlying 
quantum structure is manifest. 


The rich structure of the periodic table of atomic elements
underlying all the structural diversity of chemistry is a direct
consequence of the fermionic nature of the electron. Because
in an atom with more than a single electron not all
the electrons can sit in the lowest possible state, they have
to successively fill the higher energy states of, for example,
Figure 1.4.6. And as the chemical properties of the elements
are mostly determined by the outer electrons, they
will be different because the charge distributions associated
with the states of the outer electrons may have quite
different shapes, as we saw in Figure 1.4.9.


With our knowledge of quantum theory we might prefer
to draw the periodic table differently, for example following
Janet as we did in Figure 1.4.14. In that non-standard
visualization there is a direct correspondence with the way
the atomic quantum states are labeled as we depicted in
Figure 1.4.9. The quantum states at a given energy $E_n$ are
organized into shells labeled by the angular momentum
quantum number l that runs from 0 to n − 1. Quantum
theory tells us how things work on the microscopic scale,
but as a consequence thereof it leaves indelible marks on
much larger scales in fields like chemistry and material
science. We will have a lot more to say on chemistry and
condensed matter in Volume III.
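The block sizes of the Janet table follow from the rule of 2(2l + 1) states per value of l. A minimal sketch, in which the function names and the restriction to the four letters s, p, d, f are choices made for this example:

```python
BLOCK_LETTERS = "spdf"  # l = 0, 1, 2, 3


def block_capacity(l):
    """Number of electron states in a block: 2 spin states times (2l + 1) values of m."""
    return 2 * (2 * l + 1)


def shell_blocks(n):
    """Blocks available at principal quantum number n, for l = 0 .. n - 1 (up to f)."""
    return {
        BLOCK_LETTERS[l]: block_capacity(l)
        for l in range(min(n, len(BLOCK_LETTERS)))
    }


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, shell_blocks(n))
```

For n = 4 this yields the block widths 2, 6, 10 and 14 that set the column counts of the s, p, d and f blocks.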
Nuclear structure


Nuclei consist of a certain number of protons and neutrons
that are kept together by the strong nuclear force.
Nuclei that occur in nature are relatively stable for the
excellent reason that they wouldn’t be there otherwise, but
not all nuclei we find in nature are stable. There are many
metastable isotopes that can decay in a variety of ways,
either by the emission of protons, neutrons, α particles, or
by $\beta^\pm$ or γ radiation. Many of these occur spontaneously
in nature and have important applications, for example in
the context of carbon dating. Short-lived $\beta^+$ radiators are
for example used as radioactive tracers for PET scanning
purposes.


An atom consists of a positively charged massive nucleus 
in the core and a number of electrons ‘orbiting’ around it, 
making the overall charge of the atom zero. The natural 
next step in the quest for fundamental building blocks was 
to proceed to the structure of the nucleus itself. As always 
in science, if one observes regularities in structure, one 
tries to figure out an underlying mechanism that explains 
those regularities. Here it was not different. The ques- 
tion was open ended in the early days of quantum the- 
ory, and it might have happened that one entered a realm 
where even quantum theory would fail. How exciting! But 
alas, that didn’t happen, physics in the nuclear domain ap- 
peared to fully obey the quantum laws. The mechanism 
underlying nuclear binding is similar to that of the atom in 
some aspects, but different in others. 


Nucleons: Protons and neutrons. Nuclear fission experiments
demonstrated that nuclei are composed of particles
called nucleons, of two types, the proton and the neutron.
From Table B.4 at the end of the book about the discoveries
of fundamental particles, we learn that the neutron
was discovered by James Chadwick as late as 1932, for
which he received the Physics Nobel prize in 1935, but
remarkably we also learn nothing about the discovery of
the proton as such. That discovery was implicitly made
with the discovery of the atomic nucleus by Rutherford in
1911, where the proton is defined as the nucleus of the
simplest atom, hydrogen. Rutherford, the great physicist
and chemist from New Zealand who spent most of his active
research years in Canada and Britain, is often called
the ‘father of nuclear physics’. He was awarded the Nobel
prize for Chemistry in 1908, ‘for his investigations into
the disintegration of the elements, and the chemistry of
radioactive substances’. The neutron was discovered
relatively late, presumably because it is unstable as a free
particle: it decays under emission of an electron (and an
invisible (anti-)neutrino) into a proton! This process was
at the root of all radioactive β decay processes of nuclei,
discovered by Henri Becquerel as early as 1896 and
dramatically expanded by Marie and Pierre Curie. If nuclei
are made of protons and neutrons, the first question
that comes to mind is: how can positively charged protons
stick together so closely in a tiny nucleus if they all
carry the same positive charge? Equal charges repel, and
repel more strongly if they get closer to each other,
because the Coulomb force is inversely proportional to their
squared distance. So, how come nuclei don’t fly apart?
What keeps them together?


Figure 1.4.15: The nuclear potential between protons as a
function of their distance. The potential is given by the purple curve,
which is the sum of a long-range electromagnetic (Coulomb) repulsion
(in red) and a short-range attractive part due to the strong nuclear
force (in blue). Once the particles get close enough they are
strongly bound.


A looming crisis leading to a considerable number of gifted
desperados in search of new physics! A simple but bold
idea would be to bluntly postulate a new strong ‘nuclear’
force that would be stronger than the electromagnetic force
so that it could overcome the electromagnetic repulsion
and cause a net attraction. If we in addition assume that
this strong nuclear force works equally strongly on protons
and neutrons, this could in principle explain the nuclear
binding. And indeed, that is the way it worked out!


The picture looks like Figure 1.4.15, where we have plot- 
ted the interaction energy of two protons as a function of 
their distance. It is important to note that there are two 
contributions, one from the electromagnetic repulsion (the 
red curve), which is long range and typically falls inversely 
proportional to the distance, and one from the attractive 
nuclear force (the blue curve), which is strong but acts 
only over a short range. These two contributions add up to 
the interaction energy corresponding to the purple curve 
where one sees that the repulsion dominates for large dis- 
tances. Compare these curves for the nuclear binding 
energy with those we gave for the atomic binding in Fig- 
ure 1.4.5, where the ingredients are similar but work out 
very differently; in the atomic case the attraction dominates 
the long distance behavior. 
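The qualitative shape of the curves can be reproduced with a toy model. The sketch below adds a Coulomb term, which falls off as 1/r, to a short-range attractive term. The exponentially damped (Yukawa-like) form of the attraction, its strength of 50 MeV·fm and its range of 1.4 fm are illustrative assumptions, not values taken from the text or from Figure 1.4.15.

```python
import math

COULOMB_MEV_FM = 1.44  # e^2 / (4 pi eps0) in MeV*fm (standard value)
STRONG_MEV_FM = 50.0   # strength of the attraction: illustrative assumption
RANGE_FM = 1.4         # range of the attraction in fm: illustrative assumption


def coulomb(r_fm):
    """Long-range electromagnetic repulsion between two protons (MeV)."""
    return COULOMB_MEV_FM / r_fm


def strong(r_fm):
    """Short-range attraction, modeled here with an assumed Yukawa-like form (MeV)."""
    return -STRONG_MEV_FM * math.exp(-r_fm / RANGE_FM) / r_fm


def total(r_fm):
    """Sum of both contributions (the analog of the purple curve)."""
    return coulomb(r_fm) + strong(r_fm)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"r = {r:5.1f} fm: total = {total(r):+8.3f} MeV")
```

With these numbers the sum is negative (attractive) at short distances and positive (repulsive) at large ones, which is the behavior described for the purple curve.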


Of course, the instability of neutrons also had to be
included in this picture, and that involves postulating
yet another force, the so-called weak nuclear force,
which will be discussed on page 196.


Isotopes and nuclear decay modes 


Isotopes are nuclei that differ from their standard stable
composition by having more or fewer neutrons. This means
that these are metastable under various forms of emission.
Some are short-lived, and some are long-lived. Nuclear
isotopes have important applications.


Isotopes. Nuclei are characterized by two labels: one is
the atomic number (basically the nuclear charge in units of
the elementary charge e) and the other is the mass number.
These labels can be easily converted into the number
of protons, $n_p$, and the number of neutrons, $n_n$, in the
nucleus, as follows:


atomic number = $n_p$ (1.4.9)


mass number = $n_p + n_n$ (1.4.10)


The basic question was to understand the stability of the
well-known atomic nuclei corresponding to the chemical
elements. It turned out to be a matter of striking balances.
For a chemical element the atomic number in the periodic
table is clearly identified with the number of protons
in the nucleus. In principle one would expect that, given
the electric charge ($\sim n_p$), there could be different
numbers of neutrons and therefore one could expect different
atomic weights for a given element. This is indeed the case
and we speak of different isotopes of the element, where
the atomic number is the same but the mass number differs.
As their charge configuration is the same, their
chemistry is also the same, because that is basically
governed by the electronic states around the nucleus.
Well-known examples of isotopes are deuterium and tritium, the
heavy forms of hydrogen. In addition to the proton, they
have one and two neutrons respectively, and are therefore
often denoted as $^2$H and $^3$H to distinguish them from
ordinary hydrogen, H = $^1$H.
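Equations (1.4.9) and (1.4.10) amount to a one-line conversion from the two labels to the neutron number. A small sketch, with function and variable names chosen for this example:

```python
def neutron_count(atomic_number, mass_number):
    """Number of neutrons n_n = mass number - atomic number (Eqs. 1.4.9, 1.4.10)."""
    if mass_number < atomic_number:
        raise ValueError("mass number cannot be smaller than atomic number")
    return mass_number - atomic_number


# (atomic number, mass number) for the hydrogen isotopes named in the text
HYDROGEN_ISOTOPES = {
    "hydrogen": (1, 1),
    "deuterium": (1, 2),
    "tritium": (1, 3),
}

if __name__ == "__main__":
    for name, (z, a) in HYDROGEN_ISOTOPES.items():
        print(f"{name}: {z} proton, {neutron_count(z, a)} neutron(s)")
```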


Another important isotope is the carbon isotope $^{14}$C, to
be distinguished from the stable isotope $^{12}$C. The former
occurs naturally but is unstable, due to the decay process
into nitrogen-14,


$^{14}\mathrm{C} \rightarrow {}^{14}\mathrm{N} + e^- + \bar{\nu}$, (1.4.11)


where it emits an electron and an anti-neutrino. This decay
is very slow, with a half-life $\tau_{1/2}$ of 5730 years. It is this
slow decay that is put to use in carbon-14 dating methods
to determine the age of sediments, fossils and antique
art objects. How nice, a nuclear instability that renders
an important service to society, as it helps to unambiguously
separate real from fake when it comes to providing
quantitative archeological, historical and anthropological
evidence about the age of objects.


Figure 1.4.16: Stable and unstable isotopes. The array of
nuclides or nuclear isotopes, with on the horizontal axis the
number of neutrons and on the vertical one the number of protons.
The narrow black band in the middle of the colored region marks
the stable nuclei. The other colors refer to nuclear decay types
explained in the next figure.


In Figure 1.4.16 we display the array of isotopes, with the
number of neutrons on the horizontal axis and the number
of protons (i.e. atomic number) on the vertical one. The
stable nuclei form the black curve through the center of
the colored band; below and above it are bands of unstable
nuclei that may or may not occur in nature. Note that the
line of stable elements lies below the $n_p = n_n$ line, which
indicates that ever more neutrons are needed to stabilize
the nucleus with increasing charge. The line of stable
elements ends, indicating that beyond a certain atomic number
all isotopes become unstable (around $n_p = 82$).


Figure 1.4.17: Nuclear decay modes. The basic decay modes
of nuclei correspond to moves in the diagram: $\beta^-$ decay
corresponds to the emission of an electron, $\beta^+$ decay to the
emission of a positron. α-radiation corresponds to the emission of
a $^4$He nucleus consisting of two protons and two neutrons.


Nuclear decay modes. At any point in the chart of isotopes
there are a number of conceivable instabilities
corresponding to moving to neighboring spots as indicated in
Figure 1.4.17. The nearest neighbors, found by moving up,
down or sideways in the chart, correspond to adding or
getting rid of a single neutron or proton. But we may also
think of other so-called transmutation modes; for example
the nucleus may emit α-radiation, which just means that it
emits a (stable) $^4$He nucleus consisting of two protons and
two neutrons. In our diagram this implies that the nucleus
moves two steps to the left and two steps down. Another
possibility is that a neutron in the nucleus decays by $\beta^-$
decay into a proton and emits an electron (and an
anti-neutrino), in which case we move one step up and one to
the left. This is because the net charge increases by one
unit, meaning that the nucleus moves one step up in
our chart, but at the same time it moves one step to the left
as the number of neutrons is decreased by one.
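These moves in the chart can be written down as a small lookup table. In the sketch below each decay mode is a pair (change in neutron number, change in proton number), i.e. (horizontal step, vertical step). The names are illustrative, and the $\beta^+$ entry (one step right, one step down) is the standard mirror image of $\beta^-$ decay rather than something worked out in the text.

```python
# (change in neutron number, change in proton number) for each decay mode
DECAY_MOVES = {
    "alpha": (-2, -2),       # emits a 4He nucleus: two left, two down
    "beta_minus": (-1, +1),  # neutron -> proton: one left, one up
    "beta_plus": (+1, -1),   # proton -> neutron: one right, one down (assumed)
}


def decay(n_neutrons, n_protons, mode):
    """Return the (neutrons, protons) position in the chart after one decay step."""
    dn, dp = DECAY_MOVES[mode]
    return n_neutrons + dn, n_protons + dp


if __name__ == "__main__":
    # Carbon-14 has 8 neutrons and 6 protons.
    print(decay(8, 6, "beta_minus"))  # lands on 7 neutrons, 7 protons: nitrogen-14
```

Note that both β moves leave the mass number $n_p + n_n$ unchanged, while an α decay lowers it by four.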


For each isotope the dominant decay mode is color coded 
in the chart of Figure 1.4.16, and as expected the dominant 
decay tends to move the isotope to the black line of stable 
elements. 


The chart shows that away from the stable nuclei marked
in black, we have a rather broad band of metastable
nuclides or isotopes, but that band is bounded. On the
very right of the table we get into a region where the
would-be elements have no stable isotopes at all. These are
elements that do not occur in nature. But that didn’t keep
physicists like Glenn Seaborg at Berkeley from cooking
them up in the lab. And as you see, the nuclear physicists
have filled out the periodic table up to an atomic number
of about 120 by now. The new elements carry legendary
names like Einsteinium, Curium, Bohrium, and so on. An
ironic footnote is that, while named after scientists whose
names may well live forever, the corresponding elements
themselves are only extremely short-lived.


Figure 1.4.18: Half-life versus decay time. In the case of
exponential decay, the half-life $\tau_{1/2}$ is the time needed for half
of the initial number $N_0$ of particles to have decayed. After a decay
time τ (which we chose equal to 2) only $N_0/e$ are left. In the figure
we have chosen a scale $N_0 = 6 \times 10^{14}$.


Half-lives count. We mentioned already in passing the
quantessential notion of a half-life or a decay time, usually
denoted as τ, and it may deserve some explanation. If we
take for example a number N of the metastable $^{14}$C
nuclei, each of which has a certain probability to decay, then
the number of nuclei that decay per unit time will be
proportional to N. This statement can easily be translated
into an equation for the decay rate per unit time dN/dt:


$\frac{dN}{dt} = -\frac{N}{\tau}$. (1.4.12)


The solution (see also the Math Excursion on page 612 of
Volume III) can be written as


$N(t) = N(0)\, e^{-t/\tau}$, (1.4.13)


where N(t) is the number of $^{14}$C nuclei at time t. You see
that the decay is exponential, and the rate equals 1/τ. The
reader may be more familiar with the notion of a half-life
$\tau_{1/2}$; the relation is simply $\tau_{1/2} = \tau \ln 2$. This makes the
decay go like $2^{-t/\tau_{1/2}}$, so that after time $t = \tau_{1/2}$ only half
the number of nuclei are left. In Figure 1.4.18 this relation
is visualized. What is remarkable about nuclear decays is
that their half-lives can be immense, even as big as the
lifetime of the universe! How can a microscopic mechanism
with very short characteristic timescales, like those inside the
nucleus, produce such incredibly slow processes? Thinking
quantum mechanically you would expect a ground state of
a certain energy $E_0$ to typically oscillate with a frequency
of order $\nu = E_0/h$, and for a nucleus $E_0 \approx 1$ keV, which
yields a frequency of $10^{17}$ Hz or an oscillation time of
order $10^{-18}$ s. This value has to be contrasted with the decay
time of order $10^{11}$ s for carbon, for example. That is a factor
of about $10^{29}$!
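The relation between decay time and half-life, and its use in carbon dating, can be sketched as follows. The function names are illustrative, and the sketch idealizes dating as a pure application of Eq. (1.4.13), ignoring calibration issues that real measurements have to deal with.

```python
import math

C14_HALF_LIFE_YEARS = 5730.0  # half-life of carbon-14 quoted in the text


def decay_time(half_life):
    """Decay time tau from the half-life, using tau_half = tau * ln 2."""
    return half_life / math.log(2)


def remaining_fraction(t, half_life):
    """Fraction N(t)/N(0) = exp(-t / tau) of nuclei left after time t."""
    return math.exp(-t / decay_time(half_life))


def age_from_fraction(fraction, half_life):
    """Invert Eq. (1.4.13): elapsed time for a given remaining fraction."""
    return -decay_time(half_life) * math.log(fraction)


if __name__ == "__main__":
    print(remaining_fraction(C14_HALF_LIFE_YEARS, C14_HALF_LIFE_YEARS))
    print(age_from_fraction(0.25, C14_HALF_LIFE_YEARS))
```

A sample that retains a quarter of its original $^{14}$C is two half-lives old, i.e. about 11,460 years.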


Imagine: you are an electron and want to get out and you
only succeed after banging on the door $10^{29}$ times! Indeed
it is exponentially hard to escape because there is a
high potential barrier that wants to keep you inside. The decay
is exponentially suppressed because it proceeds via
a process called quantum tunneling, a fully quantessential
mechanism that has no classical analog at all. Classically
the electron would have to climb over the mountain, but it
does not have enough energy to do that, and there is no escape
possible. But quantum mechanically it is more subtle because
there is a ‘certain uncertainty’ in the energy of the
particle, thanks to Heisenberg. This means that a version
of the uncertainty relations applies, reading $\Delta E \cdot \Delta t \geq \hbar/2$.
You may loosely paraphrase it by saying that the particle
can ‘borrow’ energy for a brief period of time. It’s like
magic: if you do the trick fast enough nobody will notice
and miracles are possible! Anyhow, this means that there
is a small probability that the electron will have sufficient
energy to get away. That probability is exponentially small
though, and depends on the height and width of the barrier.
And that explains the enormous factor $10^{-29}$. We will
say more about quantum tunneling in Part II of the book.
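To get a feeling for how strongly the escape probability depends on the height and width of the barrier, here is a rough Python sketch. It assumes a simple rectangular barrier and the standard textbook estimate $T \approx e^{-2\kappa L}$ with $\kappa = \sqrt{2m(V - E)}/\hbar$; neither the formula nor the numbers (a 5 eV barrier, an electron of 1 eV) come from the discussion above, they are purely illustrative.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J s
M_ELECTRON = 9.1093837e-31  # electron mass in kg
EV = 1.602176634e-19        # one electronvolt in joule

def tunneling_probability(barrier_ev: float, energy_ev: float, width_m: float,
                          mass_kg: float = M_ELECTRON) -> float:
    """Rough transmission estimate exp(-2 * kappa * L) for a rectangular barrier."""
    if energy_ev >= barrier_ev:
        raise ValueError("this estimate only applies below the top of the barrier")
    kappa = math.sqrt(2.0 * mass_kg * (barrier_ev - energy_ev) * EV) / HBAR
    return math.exp(-2.0 * kappa * width_m)

# Illustrative values: doubling the width squares the (already small) probability.
thin = tunneling_probability(barrier_ev=5.0, energy_ev=1.0, width_m=0.5e-9)
wide = tunneling_probability(barrier_ev=5.0, energy_ev=1.0, width_m=1.0e-9)
print(thin, wide)
```

Because the width sits in the exponent, a modest change in the barrier translates into many orders of magnitude in the decay time.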


Positron-emission tomography (PET) 


Positron-emission tomography is a medical imaging technique
for diagnostic purposes, in particular to learn about
the functioning of organs. It makes use of specific radioactive
isotopes that are injected into the patient. The scanner
then traces how the radioactive component is transported
through the body.


With the use of isotopes in the medical arena one certainly
wants to reduce the exposure of patients to potentially
harmful radiation, and therefore the isotopes needed
for this purpose are typically short-lived positron ($\beta^+$) emitters.
So here it is anti-matter that matters! If a positron is
emitted, it will run into an electron in the surrounding tissue, and
together they will annihilate into a pair of high-energy photons
that move out back-to-back. These photons get detected,
and from their momenta one can reconstruct where
the positron was located.


The suitable radioisotopes are thus to be found in the orange
region under the black line of stable nuclei in Figure 1.4.16.
Typical isotopes with short half-lives are carbon-11
($\tau_{1/2} \approx 20$ min), nitrogen-13 ($\tau_{1/2} \approx 10$ min), oxygen-15
($\tau_{1/2} \approx 2$ min), or fluorine-18 ($\tau_{1/2} \approx 110$ min). These
so-called tracers are added to compounds the body uses normally,
such as sugars, water and sometimes just the air we
breathe (oxygen-15).


Transmutation: Fission and fusion 


Nuclei aren’t good or bad, it’s what people do with 
them we have to worry about. 


We discuss the basics of nuclear fission and fusion pro- 
cesses, emphasizing their peaceful applications. This in- 
cludes the large global initiative, ITER, to construct a work- 
ing net energy producing fusion reactor. 


In Figure 1.4.19 we show the binding energy per nucleon
as a function of atomic mass number. The natural tendency
is to minimize the energy: the system will minimize
its total binding energy, assuming there are no unsurmountable
energy barriers that block access to that minimal
energy configuration. The graph clearly shows the
remarkable and important fact that elements of low mass
number tend to lower their binding energy through fusion
into heavier nuclei, whereas on the other side we see that
at high mass number, nuclei can lower their binding energy
by decaying or fissioning into lighter nuclei. Note also that,
interestingly, the elements $^{4}$He, $^{12}$C and $^{16}$O are relatively
stable. In the following subsections we will focus first on
fission and then on fusion.

Figure 1.4.19: Fusion and fission. The binding energy per nucleon
inside a nucleus as a function of its mass number. Favored
processes are those where this binding energy per nucleon decreases.
The light elements tend to fuse, while the heavy ones
tend to break up.


Fission. 


The fundamental point in fabricating a chain react- 
ing machine is of course to see to it that each fis- 
sion produces a certain number of neutrons and 
some of these neutrons will again produce fission. 

Enrico Fermi 


The heavy elements on the right of Figure 1.4.19 with a
high binding energy per nucleon are typically unstable with
respect to decay or fission processes. In these processes
the total mass number ($n_p + n_n$) has to be conserved. We
start with fission because it was easier to achieve than fusion,
not only in reactors, but also in rather singularly dramatic
experiments like the making of nuclear bombs. In
applications, whether it is in the deplorable nuclear weapon
industry, or in fission reactors, or in hospitals, one always
needs nuclei that are ‘fissible’. ‘Good fissibility’ means that
their fission after absorbing a neutron will also produce,
apart from the heavy fission products, additional neutrons
that can then destabilize neighboring nuclei. This way
one can start a chain reaction. And clearly, if that is not
extremely well-controlled it will turn into an exponentially
growing decay process, a meltdown or nuclear explosion,
depending on the circumstances. History bears witness to
quite a few such cataclysmic events, and nuclear safety
and disarmament should remain a primary concern for all
of us. We have to find a responsible balance between profitability
and safety, and the price is high if we don’t get it
right.

Figure 1.4.20: Fission of uranium. By absorbing a neutron the
$^{235}$U isotope changes to the unstable uranium isotope $^{236}$U that
splits into a $^{141}$Ba and a $^{92}$Kr nucleus plus three neutrons.


In Figure 1.4.20 we have illustrated the fission of a uranium-235
nucleus after the absorption of a neutron into the nuclei
of barium-141 and krypton-92 plus three neutrons. The
nuclear process is given by the following reaction equation:³

$$^{1}_{0}\mathrm{n} + {}^{235}_{92}\mathrm{U} \rightarrow {}^{141}_{56}\mathrm{Ba} + {}^{92}_{36}\mathrm{Kr} + 3\,{}^{1}_{0}\mathrm{n}. \tag{1.4.14}$$

It is clear that the emitted neutrons can ignite other uranium
nuclei and this will keep the process going, provided
that there is a sufficient concentration of uranium-235.
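The bookkeeping behind such a reaction equation is easy to check. The small Python sketch below adds up the mass numbers and atomic numbers on both sides, taking the krypton isotope to be krypton-92, as conservation of the total mass number requires when three neutrons are released. The tuple layout and function name are just illustrative choices.

```python
# Each entry is (symbol, mass number A, atomic number Z).
NEUTRON = ("n", 1, 0)
reactants = [NEUTRON, ("U", 235, 92)]
products = [("Ba", 141, 56), ("Kr", 92, 36)] + [NEUTRON] * 3

def totals(particles):
    """Return the summed mass number and atomic number of a list of particles."""
    total_a = sum(a for _, a, _ in particles)
    total_z = sum(z for _, _, z in particles)
    return total_a, total_z

print(totals(reactants))  # (236, 92)
print(totals(products))   # (236, 92)
```

Both the number of nucleons (236) and the total charge (92) match, as they should.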


Natural uranium is found in ore deposits in many places
around the world. It is predominantly a mixture of the two
isotopes uranium-238 (99.27%) and uranium-235 (0.72%),
and therefore to make nuclear fuel that can be used in reactors,
one has to increase the 235 fraction by an ‘enriching’
process, for example by using centrifuges to get rid
of the heavier 238 isotope. In a fission reactor the process
is moderated by materials that slow down the neutrons, such as
graphite, water or heavy water (where the hydrogen is replaced
by deuterium). The uranium-235 itself has a natural
half-life of 703,800,000 years, so no wonder there is
still a lot left from the original amount stocked in the Earth’s
crust. It naturally decays by emitting an $\alpha$ particle, producing
thorium-231, which in turn decays rapidly into
protactinium-231, and so on. It winds up in a long chain
of successive reactions of which some are fast and others
slow, with half-lives of thousands of years. The reaction
chain of uranium-235 ends with the element lead-207
($^{207}_{82}$Pb), which is stable. However, if we get the uranium-235
to absorb a neutron, it turns into uranium-236, and
that is unstable, so it breaks up into krypton and barium plus
three neutrons, and that can keep a chain reaction going.
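Why a chain reaction must be so carefully controlled can be seen from a toy model. Suppose that each fission leads on average to $k$ new fissions in the next generation; the symbol $k$ and the numbers below are assumptions for illustration only, not values taken from a real reactor. The neutron population after $g$ generations then scales as $k^g$:

```python
def neutrons_after(generations: int, k: float, start: float = 1.0) -> float:
    """Neutron population after a number of generations: start * k**generations."""
    return start * k ** generations

# k < 1: the reaction dies out; k = 1: steady operation; k > 1: exponential growth.
for k in (0.9, 1.0, 2.0):
    print(k, neutrons_after(80, k))
```

With $k = 2$ a single neutron grows to about $10^{24}$ after only 80 generations, while with $k$ slightly below one the reaction quietly dies out.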


Fusion. Going back to the binding energy curve of Figure 1.4.19
we now turn to the left side, where we see that
energy can be gained if we manage to fuse light nuclei
(like hydrogen) into a stable nucleus with higher atomic
number (like helium-4). This is not so simple because one
has to ‘overcome’ the electromagnetic Coulomb repulsion
between, for example, two protons. Now in an accelerator
this certainly could be done, but to do this on a larger scale
one has to achieve physical conditions which are quite extreme.
So, getting fusion going has turned out to be very,
very difficult. In spite of numerous experts who have been
raising expectations, the timescales for achieving fusion
have been repeatedly extended by decades. To go from
‘scientific feasibility’ to ‘successful technology’ sometimes
takes a long time and may be hard to estimate. This leads
to the familiar situation where either the optimists or the
pessimists are ridiculed!

³ We use the notation ${}^{A}_{Z}X$ with $X$ = chemical element, $A$ = mass
number and $Z$ = atomic number.


Chrysopoeia: transmutation into gold?

There was a lot more to magic, as Harry
quickly found out, than waving your wand
and saying a few funny words.
J.K. Rowling, Harry Potter and the Philosopher's Stone.

Making gold is the alchemist’s dream! In alchemy,
the term chrysopoeia means transmutation into
gold. It comes from the Greek words χρυσός,
khrusos, meaning ‘gold’, and ποιεῖν, poiein,
meaning ‘to make’. The term refers to the creation
of the stone of wisdom or the philosopher's stone.
In the early days of alchemy, in Egypt and Greece,
there was a serious quest for the stone, as it
would allow you to turn any metal into gold. It
apparently led to a kind of primordial gold rush. For
example, Zosimos’ formula of the crab supposedly
constituted a kind of recipe to brew gold out of
copper and zinc. If only copper, zinc and a Bunsen
burner would do! This ‘recipe’ would instantly turn
any ‘nitwit’ into a billionaire, for as long as they
managed to keep it secret of course! In Egyptian
antiquity there must have been loads of books
on alchemy, and, unfortunately from a historical point
of view, almost all of them have been
lost. It was an ‘executive order’ by the Roman
Emperor Diocletian in AD 296 which decreed
that all alchemy books on making gold had to be
burned. Anyway, we all know that the true heirs of
alchemy are of course our friends the stockbrokers.
Or should I say the Silicon Valley tech-billionaires
who turn doom scrolling addictions into gold!

Now you ask, can nuclear physics revive the old
gold-plated dream in a more mundane way? The
answer is a clear ‘yes!’ Gold was first synthesized
from mercury by neutron bombardment in 1941, but
the isotopes of gold produced were all radioactive,
so the gold produced had an expiration date, and
that is precisely what you don’t want. You don’t
want a fragile ‘bread and butter’ like commodity to
be your gold standard. Actually there is only one
stable gold isotope, $^{197}_{79}$Au, so to produce desirable
gold, nuclear reactions must create this isotope.

It can be done, but unfortunately it is way more
expensive than just buying gold. Gold can actually
be manufactured in a nuclear reactor by irradiation
of mercury with neutrons. For this to work
you need the mercury isotope $^{196}_{80}$Hg, which occurs
with a frequency of 0.15% in natural mercury.
That isotope can be converted into gold with some
slow neutrons: it first absorbs a neutron and then decays
through electron capture into $^{197}_{79}$Au.
I think we can be sure that all those painfully
negotiated and maintained nuclear nonproliferation
agreements are not made out of fear that bad people
might embark on breeding a nuclear goose producing
golden eggs ad infinitum.

[Figure: four protons of 1.0073 amu each fuse into one helium-4 nucleus;
$\Delta m = 0.0266$ amu $\Rightarrow \Delta E = \Delta m\,c^2 = 3.969 \times 10^{-12}$ joule]

Figure 1.4.21: Energy gain by fusion. The net energy gain
from a fusion of four protons into a $^{4}$He nucleus, as it happens
in the Sun. One atomic mass unit or amu corresponds
to 931.5 MeV/$c^2$ = $1.661 \times 10^{-27}$ kg, equivalent to $1.492 \times 10^{-10}$ joule.

The Lawson criterion. How extreme the conditions are
that have to be met in order to get fusion to work can
be expressed by the so-called Lawson criterion. John Lawson,
a young engineer working on nuclear fusion, decided
in 1955 to work out exactly how hard it is to achieve fusion.
Although his colleagues were quite optimistic about
their prospects, he wanted to prove it to himself. By
calculating the requirements for more energy
to be created in the plasma than is put in, he found that the
conditions for fusion power depend on three quantities:
temperature ($T$), density ($n$) and confinement time ($\tau$).
He derived a lower bound on the triple product, $L = n\tau T$,
which depends on the type of process and the type of machine.
For deuterium-tritium fusion one typically needs
$L > 10^{21}$ keV s/m$^3$, and that is what the international fusion
project ITER in France is expected to achieve. The technological
promise of a fusion reactor based on the Tokamak
concept, where an extremely hot nuclear plasma is confined
to a toroidal reaction chamber by very strong magnetic
fields, has been clearly established. So far no stable,
net energy producing fusion device has been constructed,
but we will discuss the ITER project shortly. It is a sobering
thought that it is not us who invented fusion; of course
nature did. And we have learned a lot from studying and
understanding the energy production in the Sun, which
basically is a gigantic nuclear fusion reactor.

Figure 1.4.22: Fusion in the Sun. This is the chain of nuclear
fusion processes that takes place in the core of the Sun. It is
a three-step process, leading from protons via deuterium and
helium-3 to the stable helium-4 nuclei. The net result is that four
protons are converted into a single helium-4 nucleus.
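The triple-product condition for deuterium-tritium fusion is simple enough to put into a few lines of Python. The threshold is the rough value quoted above; the plasma parameters in the example are purely illustrative assumptions and not the official specifications of any machine.

```python
LAWSON_DT_KEV_S_PER_M3 = 1e21   # rough deuterium-tritium threshold quoted in the text

def triple_product(density_per_m3: float, confinement_s: float,
                   temperature_kev: float) -> float:
    """Lawson triple product L = n * tau * T in keV s / m^3."""
    return density_per_m3 * confinement_s * temperature_kev

# Purely illustrative plasma parameters (assumptions, not machine specifications).
L = triple_product(density_per_m3=1e20, confinement_s=3.0, temperature_kev=15.0)
print(L, L > LAWSON_DT_KEV_S_PER_M3)
```

The product form shows the trade-off: a machine with a lower density can compensate with a longer confinement time or a higher temperature.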


Let the Sun shine. 


... No more falsehoods or derisions 
Golden living dreams of visions 
Mystic crystal revelation 

And the mind’s true liberation ... 

Let the sun shine, let the sun shine in! 


The fifth dimension in the musical Hair (1967) 


The extreme pressure caused by the gravitational force in
the core of stars turns them into extreme pressure cookers,
allowing for all kinds of fusion processes to take place. Every
second, our Sun turns 600 million tons of hydrogen into
helium, releasing an enormous amount of energy. Achieving
fusion on Earth has required a different approach since
we lack a natural pressure cooker to achieve the densities
and temperatures needed. The temperature at the
Sun’s surface is 6,000 degrees, while at its core it is 15
million degrees. Temperature combines with density in
the Sun’s core to create the conditions necessary for the
fusion reaction to occur. The gravitational forces of
stars cannot be recreated here on Earth, and much higher
temperatures are necessary in the laboratory to compensate.

Figure 1.4.23: The life cycle of the Sun. An average star like
the Sun has sufficient hydrogen to burn by fusion so as to keep
shining for about 10 billion years. It will then form a red giant
after which the core collapses to a white dwarf about the size of
the Earth.


The basic process of burning hydrogen to produce helium
through the chain of fusion processes is depicted in Figure 1.4.22.
The hydrogen nuclei are just protons, so in
the first step we make deuterium under emission of a neutrino
and a positron. The second step is to have the deuterium
and a proton fuse into helium-3 under emission of a
photon. Finally two helium-3 nuclei can fuse into the stable
helium-4 nucleus under emission of two protons. The net
energy delivered in such a process is what is calculated in
Figure 1.4.21: it amounts to about $4.0 \times 10^{-12}$ joules per
helium-4 nucleus produced. One kilogram of hydrogen contains
about $6 \times 10^{26}$ protons, enough for $1.5 \times 10^{26}$ helium-4
nuclei, so burning 1 kg of fuel this way would produce about
$6 \times 10^{14}$ joules, or $1.7 \times 10^{5}$ MWh. This
is comparable to what a 100 MW energy plant produces
in about 70 days!

Figure 1.4.24: The basic ITER process. The basic fusion process
$^{2}$H + $^{3}$H $\rightarrow$ $^{4}$He + n delivers the energy in the ITER
fusion reactor. The difference in mass between the total incoming
and outgoing nuclei is converted into energy according to
Einstein’s formula $E = mc^2$.
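The energy released per helium-4 nucleus follows directly from the mass defect. The short Python sketch below redoes that arithmetic with the numbers given in Figure 1.4.21; the variable names are of course just illustrative.

```python
AMU_IN_MEV = 931.5        # energy equivalent of one amu in MeV (from Figure 1.4.21)
AMU_IN_JOULE = 1.492e-10  # energy equivalent of one amu in joule (from Figure 1.4.21)

delta_m_amu = 0.0266      # mass defect for four protons -> one helium-4 nucleus

delta_e_mev = delta_m_amu * AMU_IN_MEV
delta_e_joule = delta_m_amu * AMU_IN_JOULE

print(f"{delta_e_mev:.1f} MeV")   # 24.8 MeV
print(f"{delta_e_joule:.3e} J")   # 3.969e-12 J
```

In the units that nuclear physicists prefer, each helium-4 nucleus produced thus delivers roughly 25 MeV.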


Having analyzed the energy production of the Sun, we
have also answered the question whether the Sun will keep
shining forever. The answer is a firm ‘no’, because the Sun
will simply run out of fuel at some point. The long-term
perspective for life on Earth looks quite dim. In some 5 billion
years the Sun will first blow up to form a red giant that
will swallow the inner planets (including the Earth). The
core will then collapse to a compact stellar object called a
white dwarf while the outer parts will be blown off into space.
The life cycle of the Sun is schematically depicted in Figure 1.4.23.
So, beware: our days are numbered!


Figure 1.4.25: ITER. The international fusion reactor located 
in France. The reaction chamber contains the plasma which is 
enclosed in a toroidal magnetic field configuration, where it is 
heated up to temperatures of a few hundred million degrees so 
that fusion can take place. (Source: ITER) 


ITER: the nuclear fusion reactor 


ITER will be the first fusion device to produce net energy
and it will be the first fusion device to maintain fusion for
long periods of time. Furthermore, it will be the first fusion
device to test the integrated technologies, materials, and
physics regimes necessary to enable the commercial production
of fusion-based electricity.


The ITER project comprises a truly global collaboration,
where China, the European Union, India, Japan, Korea,
Russia and the United States are now engaged in a 35-year
project to build and operate the ITER experimental
device. The goal of the program is to produce a net gain of
energy and deliver a prototype for the fusion power plant
of the future. It has been designed to produce 500 MW of
output power for 50 MW of input power, or ten times the
amount of energy put in. The current record for released
fusion power is 16 MW (held by the European JET facility
located in Culham in the UK). In the ITER Tokamak,
where the plasma is confined by strong magnetic fields
in a toroidal reaction chamber, temperatures will reach
150 million degrees, that is, ten times the temperature at
the core of our Sun!


The 180-hectare ITER site in Saint Paul-Les-Durance, in 
the south of France, has a 42-hectare platform the size of 
60 soccer fields. Building began in August 2010. The hope 
is that the reactor will be completed around 2030. 


Field theory: particle species and forces 


A major achievement in quantum physics in the second 
half of the twentieth century is the development of quan- 
tum field theory (QFT). It is a general formalism that en- 
compasses both quantum theory and special relativity, and 
dramatically shifted our perspective on what particles deep 
down really are. It made us understand the origins of spin, 
and of the exclusion principle and its related particle statis- 
tics properties. These developments culminated in the 
Standard Model which comprises precise and explicit new 
theories that describe the strong and weak nuclear interac- 
tions, as well as electrodynamics. Quantum theory opened 
the door to the microcosmos, and quantum field theory 
appears to correctly describe all processes down to the 
smallest scales we have been able to probe so far. 


Our quest to understand nature at ever smaller scales
forced us to study elementary processes at ever higher
momenta and energies. This is a direct consequence of
Heisenberg’s uncertainty relations: making $\Delta x$ small
requires making $\Delta p$, and thus $p$ and $E$, large. To achieve such
extreme energies one had to build big particle accelerators
like those at CERN near Geneva and Fermilab near Chicago.
Imagine, the energy consumption of one such machine is
comparable to that of a medium-size city!
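The link between small distances and high energies can be turned into a rough number. The Python sketch below uses the order-of-magnitude estimate $E \sim \hbar c/\Delta x$ for a highly relativistic particle, with $\hbar c \approx 197.327$ MeV fm; this estimate and the chosen length scales are assumptions for illustration and are not taken from the discussion above.

```python
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV times femtometre

def probe_energy_mev(delta_x_m: float) -> float:
    """Order-of-magnitude energy E ~ hbar * c / delta_x needed to resolve delta_x."""
    delta_x_fm = delta_x_m / 1e-15
    return HBAR_C_MEV_FM / delta_x_fm

# Illustrative scales: an atom, a nucleon, and a distance a thousand times smaller.
for size_m in (1e-10, 1e-15, 1e-18):
    print(f"{size_m:.0e} m -> {probe_energy_mev(size_m):.3g} MeV")
```

Resolving a thousandth of the size of a proton already calls for energies of a few hundred GeV, which is why such big machines are needed.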


When the energies become of the same order as the rest 
masses of the elementary particles involved, one neces- 
sarily has to take special relativity into account. In partic- 
ular, in view of the equivalence of mass and energy, we 
have to anticipate processes occurring where energy will 
be converted into mass and the other way around. On the 
one hand we expect the production of massive particles 
out of pure energy, and on the other hand the creation of 
pure radiation energy out of particle anti-particle annihi- 
lation. To make further progress in understanding these 
processes a theoretical framework that is consistent with 
both quantum mechanical and (special) relativistic princi- 
ples was needed. The problem was in fact twofold: one 
was to find the relativistic generalization(s) of the Schrö- 
dinger equation, and the other was to develop a formalism 
for many particle states, where particles could be created 
and annihilated and converted into pure energy in the form 
of photons for example. Implementing these two require- 
ments together gave rise to the (relativistic) quantum field 
theory formalism. 


Relativistic wave equations. Let me recall that the classical
Maxwell theory is already relativistically invariant. In
fact, it was electromagnetism that pointed Einstein the way
to relativity because it was hidden in there. You could say
that the Maxwell equations are relativistic but not really
quantum yet. With the Schrödinger equation the problem
is the other way around: it is quantum but not relativistic.
It is not, because it is based on the Newtonian (and therefore
non-relativistic) definitions of energy and momentum.
We looked at the basics of the Schrödinger equation in
a previous section and constructed the Schrödinger wave
equation by means of a substitution where we replaced
the classical $E$ and $p$ variables with differential operators,
as shown in equation (1.4.2).

That exercise showed that the Schrödinger equation is not
relativistically invariant. It is a wave equation, but quite
different from the electromagnetic wave equation (1.1.47),
which features the relativistic wave or ‘box’ operator we
introduced in equation (1.1.31). Indeed, in the electromagnetic
wave equation discussed in Chapter 1.1, space and
time are treated on an equal footing, and that is not the case
for the Schrödinger equation because it has a first-order
time derivative, but second-order spatial derivatives.


Naively following the same approach, we could start with
the relativistic expression for the particle energy and make
the same substitutions:

$$E^2 = p^2c^2 + m^2c^4 \quad\Rightarrow\quad (\hbar^2\,\Box + m^2c^2)\,\phi = 0. \tag{1.4.15}$$

Not surprisingly, we now meet again our old friend the relativistic
wave operator $\Box$, and in addition a mass term. This
seems quite straightforward, and in fact it is. This equation
was already written down by Schrödinger himself, who discarded
it for reasons that we will point out shortly. The
resulting equation is called the Klein-Gordon (KG) equation,
and after all the dust of field theory settled, it turned
out to have a consistent interpretation: it describes a scalar
particle, or a particle without spin, such as the pion.
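For readers who like to see the intermediate steps, here is a sketch of how the substitution works out. It assumes the same replacement rules as used for the Schrödinger equation, and it assumes the sign convention $\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2$ for the box operator, which may differ from the convention adopted in equation (1.1.31).

```latex
\begin{align*}
E \to i\hbar\,\frac{\partial}{\partial t}, \qquad
\mathbf{p} \to -i\hbar\,\nabla
\quad&\Rightarrow\quad
E^2 \to -\hbar^2\,\frac{\partial^2}{\partial t^2}, \qquad
\mathbf{p}^2 \to -\hbar^2\,\nabla^2, \\
\left(E^2 - \mathbf{p}^2c^2 - m^2c^4\right)\phi = 0
\quad&\Rightarrow\quad
\left(-\hbar^2\,\frac{\partial^2}{\partial t^2}
      + \hbar^2c^2\,\nabla^2 - m^2c^4\right)\phi = 0, \\
\text{divide by } -c^2
\quad&\Rightarrow\quad
\left(\hbar^2\,\Box + m^2c^2\right)\phi = 0,
\qquad
\Box \equiv \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 .
\end{align*}
```

Note that time and space derivatives now both appear to second order, which is exactly the equal footing that the Schrödinger equation lacks.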


No ground state, no physics! On the level of a quantum
equation for a single particle, interpreting the Klein-Gordon
equation like the Schrödinger equation gave rise to a real
problem. Let me digress a little on what that problem
was about. If I tell you that $b = 2$ is true, you may
say: ‘fine, so be it’. Then I square that equation and say
$b^2 = 4$, and again ask you what $b$ is. Well then, if you,
once upon a time, had dutifully executed your homework
assignments, you would not answer $b = 2$, but $b = +2$ or
$b = -2$. By squaring the equation, I have smuggled in an
extra negative solution. I managed to somehow double the
truth! How shrewd: the logic is impeccable but not always
reversible. The quadratic equation is less restrictive.


What this means is that the quadratic relation for the energy
(and the corresponding wave operator) in the KG
equation also introduces negative energy solutions; after
all, the solutions are $E = \pm\sqrt{p^2c^2 + m^2c^4}$. So we do not
add just one, but infinitely many negative energy solutions.
Well, nothing wrong with that: if we go back to the bound
states in the hydrogen atom, we see that also there we had
an infinity of negative energy bound states. The significant
difference, however, is that the negative energy values obtained
from the Klein-Gordon equation are not bounded
from below because the magnitude of the momentum is
unlimited. In other words, there would be no ground state,
and the particle it describes would be unstable. Unfortunately,
no ground state means no physics!


People got stuck in the Klein-Gordon theory, because
it seemed impossible to interpret satisfactorily. And indeed
to do relativistic quantum mechanics correctly, one
had to go beyond writing down a wave equation for a single
particle. One would have to go to quantum field theory
to resolve the apparent inconsistencies with these equations.
Nevertheless, the idea of somehow producing a
sensible relativistically invariant first-order equation as a
kind of ‘square root’ of the Klein-Gordon equation was
on the table, and the hope was that this would resolve the
problems of that equation.


The Dirac equation: matter and anti-matter


Dirac was the strangest man who ever visited my
institute. During one of Dirac’s visits I asked him
what he was doing. He replied that he was trying
to take the square-root of a matrix, and I thought to
myself what a strange thing for such a brilliant man
to be doing. Not long afterwards the proof sheets
of his article on the equation arrived, and I saw he
had not even told me that he had been trying to
take the square root of the unit matrix!

Niels Bohr
(Quoted in Kurt Gottfried, P.A.M. Dirac and the Discovery
of Quantum Mechanics.)

The remarkable features of the electron, like having an
intrinsic angular momentum called spin and being subjected
to the mysterious Pauli exclusion principle, all fell into place
after Dirac wrote down his beautiful, relativistically invariant,
first-order wave equation for the electron. But the
biggest surprise was its prediction of the existence of anti-matter.


The relativistic equation for the spin one-half electron was
published in 1928, and Paul Adrien Maurice Dirac shared
the Nobel prize with Erwin Schrödinger five years later.
This equation, for the electron and its anti-particle the positron,
and the Maxwell equations describing photons, form
the backbone of a theory called Quantum Electrodynamics
(QED), which constituted the first example of a consistent
relativistic quantum field theory. With the completion
of this theory shortly after the Second World War, a fully
relativistic and quantum mechanical treatment of the electromagnetic
interactions of electrons, positrons and photons
was achieved.


On taking square roots. To get some appreciation for one
of the most beautiful equations of physics, it is illuminating
to go back to the Klein-Gordon equation as a starting
point. We would like to take the positive root, so to say, of
the Klein-Gordon equation, but that is hard. On the mechanics side
on the left of (1.4.15), with the algebraic relation, it is easy:
you just take the root on both sides and only keep the positive
root by choosing⁴ $E = +\sqrt{p^2 + m^2}$. But on the Klein-Gordon
side of the story, you would have to take the root
of the $\Box$ operator and that is hard to define, because
you have to define what you mean by the square root of
a derivative. Strictly speaking you could express it as an
infinite series of ever higher powers of the momentum operator,
but that is not what you want, because that would
involve taking ‘infinite order derivatives’ and that makes
even strong people quail! What you really would like to
have is an expression linear in $E$, $p$ and $m$ that squares
to the Klein-Gordon operator. And that is what Dirac brilliantly
achieved by making use of matrices in defining this
miraculous ‘square root’.

⁴ To make the argument and formulas more transparent we choose
natural units where $\hbar = c = 1$ in this subsection.


A matrix root: the Weyl equation. Let me take one step
at a time and first indicate why using matrices dramatically
enlarges the space of possibilities for taking a square
root. Let us pose ‘taking the square root’ as a matrix problem.
Suppose that instead of the equation $b^2 = 4$, which
of course has solutions $b = \pm 2$, I would have considered
the matrix equation $B^2 = A$ with $A$ being 4 times the $2 \times 2$
unit matrix:

$$A = \begin{pmatrix} 4 & 0 \\ 0 & 4 \end{pmatrix}.$$

If I ask you to solve the equation for $B$, then you could have
come up with 4 independent solutions. If you start with the
set⁵ $\{X^\mu\}$,

$$X^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\quad
X^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad
X^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\quad
X^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{1.4.16}$$

then 4 independent solutions would be $B^\mu = 2X^\mu$. We
may go one step further and check that the much stronger
identity holds:

$$(p_0 X^0 + \mathbf{p}\cdot\mathbf{X})(p_0 X^0 - \mathbf{p}\cdot\mathbf{X}) = (p_0^2 - \mathbf{p}^2)\,\mathbf{1}, \tag{1.4.17}$$

because of the special properties of the set of matrices
$\{X^\mu\}$. Multiplying out the left-hand side you get $4^2 = 16$
terms that are quadratic in both the $X$-matrices and the
momentum components $p_\mu$. Equating the coefficients of
the six different momentum combinations $p_\mu p_\nu$, one obtains
six equations that the matrices have to satisfy. Firstly,
we have the condition that the symmetric products or anti-commutators
of the matrices have to satisfy $\{X^i, X^j\} = X^i X^j + X^j X^i = 2\delta^{ij}$.
Secondly, we have the condition that
the antisymmetric products or commutators have to satisfy
$[X^0, X^i] = X^0 X^i - X^i X^0 = 0$. And as you may check,
the matrices $X^\mu$ do exactly that! This special set of matrices
is called the Pauli-matrices or spin-matrices, and they
are often denoted as $\sigma_\mu$. The reason they are called the
spin-matrices will become clear shortly.

⁵ This calculation uses a little bit of the material out of the Math Excursion
on complex numbers at the end of Part III. Here it suffices to
know that $i$ denotes the ‘imaginary unit’ and that it by definition squares
to minus one: $i^2 = -1$.
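If you would rather let a computer do the checking, the following Python sketch verifies both statements: that each $B = 2X^\mu$ squares to 4 times the unit matrix, and that the product of the two linear factors gives $(p_0^2 - \mathbf{p}^2)$ times the unit matrix. The matrices are the unit matrix and the three Pauli matrices; the helper functions and the sample momentum are just illustrative choices.

```python
# 2x2 complex matrices as nested tuples; X[0] is the unit matrix,
# X[1], X[2], X[3] are the three Pauli matrices.
X = [
    ((1, 0), (0, 1)),
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def combine(coeffs):
    """Linear combination sum over mu of coeffs[mu] * X[mu]."""
    return tuple(
        tuple(sum(c * x[i][j] for c, x in zip(coeffs, X)) for j in range(2))
        for i in range(2)
    )

# Every B = 2 * X^mu squares to 4 times the unit matrix.
for x in X:
    b = tuple(tuple(2 * entry for entry in row) for row in x)
    assert matmul(b, b) == ((4, 0), (0, 4))

# The factorization for an arbitrary sample momentum (p0, p1, p2, p3).
p0, p1, p2, p3 = 5.0, 1.0, 2.0, 3.0
left = matmul(combine((p0, p1, p2, p3)), combine((p0, -p1, -p2, -p3)))
print(left)   # diagonal entries equal p0^2 - p^2 = 25 - 14 = 11
```

All the cross terms cancel, which is precisely what the anti-commutation relations guarantee.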


So, we succeeded in writing the four-momentum squared, 
as a product of two matrices linear in the momentum, that 
is without using square roots. But this nice construction 
can be applied to equations as well. If we have a mass- 
less relativistic particle, its momentum satisfies pup" = 0, 
leading to a massless Klein-Gordon equation for a (spin 
zero) scalar field of the type Lhp(x”) = 0. However, with 
what we just learned one could also introduce a linear first 
order matrix equation. This is just the so-called Weyl equa- 
tion, named after the German mathematician, theoretical 
physicist and philosopher Hermann Weyl, who wrote this 
relativistic wave equation down in 1929: 


(iX"0,.) W(xy) =0, (1.4.18) 


where ¥ is a two-component, so-called spinor, on which 
the matrices work. The wave-like solutions are of the form: 


Wrulp)e Pe" , (1.4.19) 


with u(p) a spinor. Substituting this in the Weyl equation 
we get an algebraic equation for the two-component spinor 


u(p): 
(X-p)u(p)=0 > X-pulp)=poulp). (1.4.20) 


This is an eigenvalue equation with two independent solu- 
tions u(p) = n*(p) and eigenvalues po = Ex = +\p!: 


X- pnt =E4 nË. (1.4.21) 


This positive energy n* mode describes a massless par- 
ticle with spin one-half, with its spin polarized parallel to 
its momentum. It is a particle with a fixed positive helic- 
ity which therefore is also called a right-handed particle. 


The negative energy $\eta^{-}$ component describes the corresponding
anti-particle which necessarily has the opposite
helicity. 
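The eigenvalue problem (1.4.21) is small enough to solve numerically. The sketch below again assumes that the matrices $X^i$ are the standard Pauli matrices; the chosen momentum and the helper names are purely illustrative. It returns the two energies $\pm|\mathbf{p}|$ and evaluates the helicity, that is the spin component along the direction of motion, for each mode.

```python
import numpy as np

# Standard Pauli matrices, assumed here to be the X^i of the Weyl equation.
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def sigma_dot(p):
    """Return the 2x2 matrix sigma.p for a three-vector p."""
    return sum(pi * si for pi, si in zip(np.asarray(p, dtype=float), SIGMA))


def weyl_modes(p):
    """Energies (ascending) and spinors (as columns) solving sigma.p u = p0 u."""
    return np.linalg.eigh(sigma_dot(p))


def helicity(spinor, p):
    """Expectation value of the spin (in units of hbar) along the momentum direction."""
    p_hat = np.asarray(p, dtype=float) / np.linalg.norm(p)
    return 0.5 * np.real(np.vdot(spinor, sigma_dot(p_hat) @ spinor))


p = [1.0, 2.0, 2.0]  # illustrative momentum with |p| = 3
energies, spinors = weyl_modes(p)
print(energies)                   # the two energies -|p| and +|p|
print(helicity(spinors[:, 1], p))  # positive-energy mode: spin along the momentum
print(helicity(spinors[:, 0], p))  # negative-energy mode: spin against the momentum
```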

The first factor on the left-hand side of equation (1.4.17), 
also describes a two-component spinor which can be ob- 
tained from the one we just discussed by flipping the sign 
of the energy $p_0$, so it will describe a left-handed or
negative-helicity particle, and its anti-particle.


The first thing we have to conclude is that the Weyl equa- 
tion describes a relativistic spin one-half particle. We did 
however not get rid of the negative energy solutions, but 
presumably these have to be interpreted as describing an 
anti-particle. We will return to this picture shortly. 


For a long time it was believed that neutrinos would be 
massless, left-handed particles described by a Weyl equa- 
tion, but we have in the meantime learned that neutrinos 
have a small mass after all. They therefore have to be de- 
scribed by a Dirac equation where the two chiralities get 
coupled through the mass term. 


The Dirac matrices and algebra. Dirac managed to do 
something similar for a massive particle. He started with 
the quadratic relativistic energy-momentum relation (times 
the unit matrix), and wrote it as a product of two matrix 
factors linear in the momentum. To succeed he needed 
to introduce four $4 \times 4$ matrix coefficients $\gamma^\mu$. Using the
standard, very convenient, ‘slash’ notation $\not{p} = p_\mu \gamma^\mu$
(introduced by Feynman), we may write:

$$(\not{p} + m\mathbb{1})(\not{p} - m\mathbb{1}) = (E^2 - \mathbf{p}^2 - m^2)\,\mathbb{1} . \qquad (1.4.22)$$

Again, multiplying out the left-hand side you get $4^2 = 16$ terms
that are quadratic in both the gamma matrices
and the momentum components. To satisfy the equation,
the diagonal terms require $(\gamma^0)^2 = \mathbb{1}$ and $(\gamma^i)^2 = -\mathbb{1}$,
while the six terms with a product of two different momentum
components should all vanish. The matrix coefficients
of these cross terms correspond to the anti-commutator of the corresponding
gamma-matrices:


$$\{\gamma^\mu, \gamma^\nu\} \equiv \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu . \qquad (1.4.23)$$
The upshot is that conditions on the gamma matrices that 
follow from the requirement that equation (1.4.22) is satis- 
fied are summarized by the equation: 


$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}\,\mathbb{1} , \qquad (1.4.24)$$


where $\eta^{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$ is the relativistic Lorentzian
space-time metric we encountered before. Matrices
satisfying an algebraic relation like the one above form a 
so-called Dirac or Clifford algebra. 
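The algebra (1.4.24) does not fix the matrices uniquely, so to check it explicitly one has to pick a representation. The sketch below uses the so-called Dirac representation, built from $2 \times 2$ blocks of Pauli matrices; that choice, like the helper names, is an assumption made for illustration and is not singled out by the text. It verifies the anti-commutation relations and the factorization (1.4.22) for an arbitrary momentum.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac representation (an assumed, conventional choice of gamma matrices).
GAMMA = [np.block([[I2, Z2], [Z2, -I2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # metric diag(1, -1, -1, -1)


def clifford_ok():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} * 1 for all index pairs."""
    return all(
        np.allclose(GAMMA[m] @ GAMMA[n] + GAMMA[n] @ GAMMA[m], 2 * ETA[m, n] * I4)
        for m in range(4)
        for n in range(4)
    )


def slash(p):
    """p-slash = p_mu gamma^mu for a four-momentum p = (E, px, py, pz)."""
    p_lower = ETA @ np.asarray(p, dtype=float)  # lower the index with the metric
    return sum(p_lower[m] * GAMMA[m] for m in range(4))


p, m = [5.0, 1.0, 2.0, 3.0], 2.0
lhs = (slash(p) + m * I4) @ (slash(p) - m * I4)
rhs = (p[0] ** 2 - sum(x * x for x in p[1:]) - m**2) * I4
print(clifford_ok())
print(np.allclose(lhs, rhs))
```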


The Dirac equation. We are ready to tackle the four-com- 
ponent Dirac equation which in its most compact and ele- 
gant form can be written as: 


$$(i\not{\partial} - m\mathbb{1})\,\psi(x^\mu) = 0 . \qquad (1.4.25)$$


This first-order system is relativistically invariant, because
one can show that the matrices do indeed also transform
like a four-vector. It has wave-like solutions multiplied by a
four-component spinor $u(p)$. The $4 \times 4$ $\gamma$ matrices act on
the components of the spinor. For positive energy ($E > 0$)
the solutions look like:


$$\psi(x^\mu) \sim u(p)\, e^{-i p_\mu x^\mu} . \qquad (1.4.26)$$


Substituting this in the Dirac equation yields the algebraic
equation for $u(p)$:


$$(\not{p} - m\mathbb{1})\,u(p) = 0 . \qquad (1.4.27)$$


The negative energy solutions can be written in a similar
way as:


$$\psi(x^\mu) \sim v(p)\, e^{+i p_\mu x^\mu} , \qquad (1.4.28)$$

and it yields an equation for the spinor $v(p)$:

$$(\not{p} + m\mathbb{1})\,v(p) = 0 . \qquad (1.4.29)$$


Comparing these equations we see that the Klein-Gordon
equation factorizes into a product of two first-order
equations. These two equations are then combined again
in the single four-component Dirac equation, which admits 
positive and negative energy solutions: the former corre- 
spond to the electron and the latter to the hole (or positron) 
degrees of freedom respectively. 
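To see the positive and negative energies appear side by side, one can rewrite the algebraic equation $(\not{p} - m\mathbb{1})u = 0$ as an ordinary eigenvalue problem $E\,u = \gamma^0(\boldsymbol{\gamma}\cdot\mathbf{p} + m)\,u$, which follows by multiplying with $\gamma^0$. The sketch below does this numerically; the Dirac representation of the gamma matrices and the sample values of momentum and mass are assumptions chosen for illustration (units with $c = 1$).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Dirac representation (an assumed, conventional choice).
GAMMA0 = np.block([[I2, Z2], [Z2, -I2]])
GAMMA_SPACE = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def dirac_energies(p, m):
    """Eigenvalues of H = gamma^0 (gamma.p + m), sorted in ascending order."""
    gamma_dot_p = sum(pi * gi for pi, gi in zip(p, GAMMA_SPACE))
    hamiltonian = GAMMA0 @ (gamma_dot_p + m * I4)
    return np.linalg.eigvalsh(hamiltonian)


# |p| = 3 and m = 4 give E = sqrt(9 + 16) = 5: two states at -5 and two at +5.
print(dirac_energies([1.0, 2.0, 2.0], 4.0))
# At rest the two branches are separated by a gap of 2m.
print(dirac_energies([0.0, 0.0, 0.0], 4.0))
```

Each energy occurs twice, which is the numerical counterpart of the two spin states per branch.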


It is important to remark that the four components of the
wavefunction do not form a four-vector; they form a four-component
spinor which transforms differently under Lorentz
transformations. Another way to say this is, that of the four
components, two states correspond to an electron with its
two spin states, while the other two would correspond to
a positron with its two spin states. But as the gamma-matrices
are not diagonal the equation mixes all components.
There is a lot of beautiful and important mathematics
hidden in the Dirac equation that we will not address
here at all. Our goal was to get to know the magnificent
equation that provided such a deep understanding of
quantessential properties of matter like spin, the exclusion
principle and the necessity of anti-matter.


The spectrum. Let us first look at the energy spectrum 
of the free Dirac particle as depicted on the left in Fig- 
ure 1.4.26. The first thing that strikes us in this picture is 
that the negative energy states have not disappeared. So, 
again it looks like there is no lowest energy state, and tak- 
ing the square root of the equation has not eliminated the 
negative energy states in any obvious way. Consequently 
one would think that this feature would make the model in- 
consistent and useless. But, no! Dirac brilliantly argued 
that because his particles necessarily have spin one-half, 
they would have to satisfy the exclusion principle. But if 
that is the case, he could decree that all negative energy 
states would be filled, and there would be no problem. 
There would be a lowest energy state for the next elec- 
tron to come in. So Pauli’s exclusion principle acts like a 
deus ex machina here. 


The second point to observe is that there is an energy-gap 
of $\Delta E = 2mc^2$ between the highest negative energy state


Figure 1.4.26: The spectrum of the Dirac field. The energy 
spectrum of the Dirac equation for a spin 1/2 particle of mass 
m. It has positive and negative energy states. The negative 
energy states are all filled and form the ‘Dirac sea.’ A high-energy
photon ($E > 2mc^2$) can excite an electron out of the
sea into a positive energy particle state, and the hole that stays 
behind is just an anti-particle with opposite charge and opposite 
momentum. 


and the lowest positive energy state. In the field theory 
context this means that exciting an electron from a neg- 
ative energy state to a positive state would cost at least 
$2mc^2$, and effectively produce both a particle and a ‘hole’.
There is no such thing as only creating a particle. The 
‘hole’ is nothing but the anti-particle or positron, having the 
same mass and the opposite charge. So from the ‘vacuum’ 
state, corresponding to the completely filled ‘Dirac sea’ 
of negative energy states, one may create particle anti-particle
(electron-positron) or particle-hole pairs. This is indicated
on the right-hand side of Figure 1.4.26. The bubble
chamber picture in Figure 1.4.27 clearly shows the successive
creation of two pairs from a high-energy photon.
An understanding of the Dirac equation therefore leads
inevitably to the prediction and discovery of anti-matter.
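To put a number on the gap, here is a small piece of energy bookkeeping for electrons. It is a sketch only: the rest energy is the standard rounded value of $m_e c^2$, the photon energies are made-up examples, and momentum conservation (which in practice requires a third body to take up the recoil) is ignored.

```python
ELECTRON_REST_ENERGY_MEV = 0.51099895  # m_e c^2 in MeV (standard value, rounded)


def pair_threshold_mev(rest_energy_mev=ELECTRON_REST_ENERGY_MEV):
    """Minimum energy needed to lift one particle across the gap: 2 m c^2."""
    return 2.0 * rest_energy_mev


def energy_left_over_mev(photon_energy_mev):
    """Energy left for the motion of the pair, or None below threshold."""
    excess = photon_energy_mev - pair_threshold_mev()
    return excess if excess >= 0 else None


print(pair_threshold_mev())       # about 1.02 MeV
print(energy_left_over_mev(5.0))  # a 5 MeV photon has about 3.98 MeV to spare
print(energy_left_over_mev(0.8))  # None: below threshold, no pair can be created
```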


Dirac himself hoped initially that the positively charged particle
would correspond to the proton, so that the equation would somehow
describe the complete hydrogen atom. It was Robert Oppenheimer,
then at the University of California, Berkeley, who pointed out that the oppositely
charged particles had to have the same mass and therefore the equation
implied a new species of particles, now denoted as anti-matter.


Figure 1.4.27: Pair creation. This is a bubble chamber picture of
a high-energy photon which enters from the left (and is not visible
because it has no charge). It knocks out an electron, thereby
also creating a relatively low energy pair. Later the photon produces
a second pair with more energy. A strong magnetic field
is applied perpendicular to the page, which causes the particle
trajectories to curve depending on their charge and energy:
a perfect way to separate the electron and positron tracks.


Condensed matter. It is quite gratifying to see that 80
years after Dirac wrote his equation down, it is still alive
and kicking. This equation is here to stay! Apparently
Dirac himself once quipped that the equation was far more
intelligent than its author. And indeed, it has found many
important and fundamental applications. Firstly, the equation
or variants thereof not only describe the electron, but
in fact all elementary constituent particles like the leptons
(electrons, muons, neutrinos etc.), and the quarks. Not
so surprising as all of them have spin one-half and are
fermions. Secondly, the Dirac equation and the field-theoretic
concepts that come with it are also extremely relevant
in condensed matter physics. This is somewhat surprising
because a priori that is at low energy and you would not
expect excitations to satisfy a relativistic equation. Yet, in
a conductor the electrons fill the available energy states
up to a level which is called the Fermi level. And there
you have a situation which is like the Dirac vacuum, and
indeed you can excite an electron which means that you
effectively create a particle-hole pair.


Majorana fermions. There are even closer analogies, 
since quite recently so-called topological phases of matter 
have been predicted in which boundary excitations occur 
that are effectively behaving like massless Dirac modes, 
thus behaving like relativistic particles. An example is the 
so-called Majorana fermion, which is a special case where 
the particle is its own antiparticle. So it has only two components.
The theory of the Majorana fermion goes back to
the thirties of the twentieth century, to Ettore Majorana, a brilliant
young Italian physicist who proposed the model, but then mysteriously
disappeared. In fact his disappearance has never been
fully resolved or explained. Whereas his person remains a 
mystery, his equation fortunately does not. 


The mathematics of the Dirac operator. Finally, the no- 
tion of the Dirac operator, which is the first-order differen- 
tial operator that defines the Dirac equation, plays an im- 
portant role in pure mathematics. For example the index 
of the massless Dirac operator on smooth curved mani- 
folds is directly linked to certain topological invariants of 
that manifold, through the so-called Atiyah-Singer index
theorem. We will return to the Dirac equation in somewhat 
more detail in the next Volume. 


Quantum Electrodynamics: QED 


Quantum Electrodynamics is the first and very successful 
example of a quantum field theory. We outline some of its 
basic structure and properties, and mention states, operators
and Feynman diagrams. This theory, starting from first
principles, made some impressive, precise predictions that 
agreed with experiment up to 12 significant digits! 


The first milestone in relativistic field theory was the formu- 
lation of Quantum Electrodynamics (QED), a completely 
consistent quantum theory of electrons, positrons, pho- 
tons and their interactions. The theory was completed just 
after the Second World War, quite independently, by the 
American physicists Richard Feynman and Julian Schwin- 
ger, as well as the Japanese Sin-Itiro Tomonaga. They 
jointly received the 1965 Nobel prize in Physics for this 
work. This success generated further developments in 
field theory which during the 1970s culminated in the formulation
of the successful Standard Model of all the known
elementary particles and the fundamental forces between 
them. 


Particles and force fields. In classical physics there is 
a clear (ontological) distinction between, on the one hand, 
constituent particles carrying mass and charge (like elec- 
trons and protons), that are often considered ‘point-like’, 
and on the other hand the force fields through which they 
interact like the electromagnetic field, and which spread 
out over all of space-time. In relativistic quantum field theory
this distinction disappears. Particles correspond to
‘wavefunctions’ or states of quantum fields which can be
spread out. But the arrow goes both ways, so classical 
force fields (like the electromagnetic field) when quantized 
have particle-like excitations (like the photon). And we say 
that the forces are carried or mediated by those particles. 
Particle-wave duality is lifted to a particle-field duality at a 
higher (or should I say, deeper) level.


So, the electron and its anti-particle the positron are de- 
scribed by a Dirac-type quantum field, as are the neutri- 
nos and the quarks. A state of the electron quantum field 
may describe any number of electrons and/or positrons. 
So, there is one field for all electrons. In fact, every par- 
ticle type has its own quantum field. But, also the force 
fields of the strong and weak interactions have their own
quantum fields, whose particle-like quanta we call gluons 
and W and Z bosons, respectively. In other words, there is 
a universal particle-field correspondence on the quantum 
level if we take relativity into account. Quantum field theory 
transcends the distinction between particles and forces in 
an essential way, yet the quantum fields describing con- 
stituent particles and force fields have distinctive features, 
because the constituents are fermions with spin one-half, 
and the force fields are bosons with spin one. 


States and operators. A distinguishing feature of the 
quantum field theory framework is that it allows for the 
creation and annihilation of particles. Let me try to give 
a flavor of how that works. The first important ingredient 
is the existence of a vacuum or a ground state denoted 
as the zero state $|0\rangle$, that is the state without any particle
in it. The second ingredient is that quantum fields can
be expressed in terms of particle creation and annihilation 
operators that can act on the vacuum, or any other state, 
and create or annihilate a particle in that state. A generic 
state is in fact a multi-particle state that is labeled by the 
number of particles present in the state and what their en- 
ergy, momentum and spin-polarizations are. For example 
a state, 


$$|\,n_\gamma(\epsilon^\mu, k^\mu),\; n_e(s, p^\mu),\; n_p(s', p'^\mu)\,\rangle$$


would correspond to a state with $n_\gamma$ photons in a state with
four-momentum $k^\mu$ and polarization vector $\epsilon^\mu$, and so on.
The electrons and positrons have spin one-half, and their
spin-polarization is encoded in the variables $s$ ($s'$).


Particle creation and annihilation.⁷
The physics we want to describe involves the creation and
annihilation of particles and this is implemented by the creation
and annihilation operators we just mentioned. The
photon field, for example, corresponding to the vector potential
$A^\mu(\mathbf{x}, t)$, has a linear expansion in photon creation
and annihilation operators, which are denoted $a^\dagger(\epsilon^\sigma, k^\nu)$
and $a(\epsilon^\sigma, k^\nu)$. If the creation operator acts on a state, it
creates a particle in the corresponding state, so for example:

$$a^\dagger(\epsilon^\sigma, q^\nu)\,|0\rangle = |\,n_\gamma(\epsilon^\sigma, q^\nu) = 1\,\rangle .$$


Here the creation operator acts on the vacuum and creates
a new state in which there is one photon present ($n = 1$),
with the specified polarization and energy-momentum. If
you apply the annihilation operator to the vacuum, you
would simply get zero:


$$a(\epsilon^\sigma, k^\nu)\,|0\rangle = 0 ,$$


because there is no particle to be annihilated. If there had
been a particle with the corresponding properties in the
state, that particle would be annihilated and we would end
up with the vacuum state. But if we act on the vacuum
state there is no particle to annihilate and the result is the
number zero; in the jargon, the operator ‘annihilates the
vacuum’.


⁷ I have had the pleasure of running into creationists and nihilists,
but so far not into any annihilists.
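The action of these operators can be mimicked with small matrices. The sketch below describes a single photon mode (one fixed polarization and momentum) in a basis of states with $n = 0, 1, 2, \ldots$ photons. The cut-off `N_MAX` and the conventional normalization $a|n\rangle = \sqrt{n}\,|n-1\rangle$ are assumptions that are not spelled out in the text.

```python
import numpy as np

N_MAX = 4  # assumption: keep only states with 0..N_MAX photons in this one mode


def annihilation_operator(n_max=N_MAX):
    """Matrix of a in the number basis, with a|n> = sqrt(n) |n-1>."""
    return np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)


def number_state(n, n_max=N_MAX):
    """Column vector representing the state with exactly n photons."""
    state = np.zeros(n_max + 1)
    state[n] = 1.0
    return state


a = annihilation_operator()
a_dagger = a.T  # the matrix is real, so the Hermitian conjugate is the transpose
vacuum = number_state(0)

one_photon = a_dagger @ vacuum
print(np.allclose(one_photon, number_state(1)))  # creation: |0> becomes |1>
print(np.allclose(a @ one_photon, vacuum))       # annihilation brings us back to |0>
print(np.allclose(a @ vacuum, 0.0))              # a 'annihilates the vacuum'
```

Note that the last result is the zero vector, that is the number zero, and not the vacuum state, which is a perfectly respectable state of unit length.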


In general one considers rather elementary processes, with
a few incoming particles forming an incoming state; then
these particles interact with each other (so typically particles
will be annihilated and created), and what we want
to know is what the possible final states are and what the
probabilities are that they occur. Doing these calculations
demands a lot of skill, since they tend to be extensive and
it takes even the largest computers days to do the job. But
the hardest part is to set up the calculation and figure
out in all detail which sub-processes will be there, and how
important they are. It also involves an incredible amount of
bookkeeping which of course has to be performed impeccably,
and one therefore has to build in all kinds of checks
and balances to see whether the extremely rigid laws are
completely obeyed at any stage of the computation. Experiments
like those at CERN are also at the forefront of
all kinds of AI applications, both on the data analysis side
and on the theoretical, calculational side, where we have
to distinguish the numerical methods from the highly automated
algebraic manipulation technology.


Figure 1.4.28: Propagators. Particle propagation lines or propagators
for various particle types. The arrow on the fermion line
keeps track of the charge or better of the particle versus anti-particle
degrees of freedom of the field.


The language of Feynman diagrams. At this point the 
diagrammatic language created by Feynman becomes an 
indispensable tool. Let me give you an impression of how 
this methodology works. We have mentioned the (incoming and
outgoing) states, and these are represented as lines entering
or leaving the diagram, where each particle type
has its own type of ‘propagation’ line as illustrated in Fig- 
ure 1.4.28. The interactions are represented by diagrams 
where the particles that interact come together at a vertex. 
For example in Figure 1.4.29 we see an electron emitting 
or absorbing a photon, where the electron moves on but 
with a different momentum. 


Figure 1.4.29: Interaction vertex. The unique QED interaction
is given by the interaction vertex of a photon with a charged
particle, like an electron or quark. The strength of the coupling
equals the coupling constant ‘e’.


The theory is relativistically invariant which means that you
can make space-time ‘rotations’ or Lorentz transformations.
This implies that you can also rotate the diagram clockwise
over 90 degrees (as in Figure 1.4.30 on the left), and you
get the diagram for a photon coming in and an electron
coming out and (help!) what is that? It looks like an electron
moving backwards in time! You may think so, but that
is indeed what a positron is. A negative charge (electron)
moving backward in time is the same as a positive charge
moving forward in time, because that is the way the Dirac
equation works. At the vertex (the red dot) the interaction
takes place with a strength of the charge e, and in the
interaction the energy, momentum and charge have to be
conserved. So, what goes in has, in some form, to come
out again. If you had rotated the diagram counterclockwise
instead (as in the same figure on the right), you would have
obtained a diagram representing electron-positron annihilation
into a photon, and indeed the total charge is zero at
all times.


As far as QED is concerned these are roughly the fun- 
damental rules but the diagrams may become arbitrarily 
complicated.


Figure 1.4.30: Interaction vertex. The rotated diagram gives
the coupling for the creation of an electron-positron pair on the 
left, and the annihilation of an electron-positron pair on the right. 
The convention is that in the diagram time goes upward. Note 
that the electron (negative charge) moving backward in time is 
the same as a positron (positive charge) moving forward in time. 


That is the quantessence of the trade: you know what 
comes in and presumably also what comes out, and then, 
in principle, you have to construct all possible diagrams 
that — obeying the rules — can be drawn in between. You 
can of course order the diagrams by the number of vertices 
they have and if the coupling is small, then the contribution 
of higher-order diagrams becomes ever smaller. So, you 
stop after a few orders and get a sufficiently accurate re- 
sult. You calculate the diagrams one by one and then add 
the results to obtain what is called the total quantum prob- 
ability amplitude for the process. 
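Why one can afford to stop after a few orders is easily illustrated. Every vertex contributes a factor $e$ to the amplitude, so each additional pair of vertices costs roughly a factor of the fine-structure constant $\alpha = e^2/4\pi \approx 1/137$ (in natural units). The sketch below only counts these powers; it deliberately ignores the numerical coefficients and the number of diagrams at each order, which a real calculation has to supply.

```python
ALPHA = 1 / 137.035999  # fine-structure constant (dimensionless, rounded)


def relative_size(extra_vertex_pairs, alpha=ALPHA):
    """Rough size of a higher-order amplitude relative to the leading one.

    Pure power counting: one factor of alpha per extra pair of vertices.
    """
    return alpha**extra_vertex_pairs


for k in range(4):
    print(k, f"{relative_size(k):.1e}")
```

On this crude estimate, every further order in the expansion buys roughly two more decimal places.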


Figure 1.4.31: Photon exchange. A lowest order photon
exchange diagram contributing to the amplitude for electron-electron
scattering.


As the term probability amplitude suggests, you have to
square (the absolute value of) this expression to obtain the
probability for the process to take place. In Figure 1.4.31 we
give, for example, one of the (two) leading, lowest order
diagrams that contribute to the probability amplitude for
electron-electron scattering. What I am trying to convey is
that the diagrams furnish a powerful and precise symbolic
language which represents an intricate mathematical structure.
The Feynman
rules give you the unique translation of the diagrams into
complicated but very explicit mathematical expressions that
then have to be evaluated (mostly by computer) to get the
real probabilities out.
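The order of the two operations matters: the amplitudes of the individual diagrams are complex numbers that are added first, and only the sum is squared. The toy example below uses two made-up amplitudes (they are not the result of any actual QED calculation) to show that diagrams can reinforce or cancel each other, something that could never happen if one added probabilities.

```python
import cmath


def probability(amplitudes):
    """Add the complex amplitudes of all diagrams, then take the absolute square."""
    total = sum(amplitudes)
    return abs(total) ** 2


# Two illustrative amplitudes of equal size; only their relative phase differs.
in_phase = [0.6, 0.6]
out_of_phase = [0.6, 0.6 * cmath.exp(1j * cmath.pi)]

print(probability(in_phase))      # constructive interference: four times a single diagram
print(probability(out_of_phase))  # destructive interference: essentially zero
print(sum(abs(a) ** 2 for a in in_phase))  # naive sum of probabilities, for comparison
```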


Doing precision measurements means doing critical pre- 
cision tests on theoretical models and that is the core of 
empirical science. Realistic precision calculations may in- 
volve hundreds or thousands of diagrams, and so even 
the generation of all the allowed diagrams is done by com- 
puter. In this field a lot of pioneering work in symbolic ma- 
nipulation by computers has been done. Nobel laureate 
Martinus Veltman was the first with his program named 
‘Schoonschip’ which literally means ‘clean ship’, though in 
Dutch it actually means ‘cleaning up the mess.’ This pro- 
gram has gone through many upgrades and extensions 
and is still used by many practitioners. The
best-known outcome of such physics-inspired artificial
intelligence systems is the magnificent and versatile
symbolic manipulation and graphics platform Mathematica
developed by Stephen Wolfram.


Subnuclear structure 


The Standard Model 


The Standard Model is a theoretical model for the basic
constituent particles of all ordinary (meaning, not-dark) matter.
Think of the leptons such as the electrons and neutrinos,
and the quarks that make up the protons and neutrons,
but think also of the force-carrying particles that bind
the constituents together. The model gives a unified description
of three of the four fundamental forces: the electromagnetic,
and the weak and strong nuclear forces. Gravity,
however, is not included in the Standard Model. The
model has made numerous precise predictions that so far
have been vindicated by a variety of large-scale experiments
in the world’s biggest accelerators.


The Standard Model was completed in the early 1970s. 
The experimental verification of many of its predictions took 
another forty years and still continues. A landmark was the 
discovery of the W and Z bosons at CERN (and somewhat 
later at Fermilab) in 1983. Another highlight was the discovery
of the Higgs particle at CERN as recently as 2012.
It was the last missing entry in the particle table of the
model. The Higgs particle is a unique ingredient because
it provides the explanation for the mass of other particles,
in particular the masses of the weak force carrying W and 
Z particles. The presence of these masses is reflected in 
the fact that the corresponding interactions are short range 
as we discussed in the section on nuclear potentials and 
the Yukawa potential on page 166. 
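How short is ‘short range’? For a Yukawa-type potential the range is set by the mass of the exchanged particle, roughly $\hbar/(mc)$. The sketch below evaluates this estimate; the rest energies are approximate values inserted for illustration, and the result should be read as an order of magnitude only.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * femtometer

# Approximate rest energies m c^2 in MeV (rounded, for illustration only).
REST_ENERGY_MEV = {"pion": 139.57, "W": 80_400.0, "Z": 91_190.0}


def force_range_fm(rest_energy_mev):
    """Estimated range hbar / (m c) of a force carried by a massive particle, in fm."""
    return HBAR_C_MEV_FM / rest_energy_mev


for name, rest_energy in REST_ENERGY_MEV.items():
    print(f"{name}: {force_range_fm(rest_energy):.4f} fm")
```

The pion gives a range of about one and a half femtometers, the size of a nucleus, whereas the heavy W and Z particles give a range that is several hundred times smaller.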




Figure 1.4.32: This work of the Belgian surrealist painter René 
Magritte is entitled Les Jeunes Amours (1963). A more pro- 
saic title, well fitting our sub-nuclear narrative would have been 
A Color Triplet of Apple Quarks. There is even a Dirac sea in 
the background. (Source: © Photothèque Magritte / Adagp Images,
Paris)


Flavors, colors and families 


To understand the structure of the Standard Model, let us 
look at Figure 1.4.35, and explain what information is en- 
coded in the colorful tables. In each of the figures, the 
top panel contains the force-mediating particles and the 
Higgs particle; these are all bosons, i.e. they have integer
spin. We will shortly describe these in detail but it is
more convenient to first turn to the content of the lower 
panels. 


Particle families. The lower panels list the constituent 
particles: these are all fermions, and have spin one-half. 
There are three families of constituent particles denoted by 
three different colors as depicted in Figure 1.4.35(b). Only 
the first family is stable; it consists of the up and down
quarks and the electron and its neutrino. These are the
building blocks that make up all forms of stable (ordinary) 
matter in our universe. The other families consist of heav- 
ier but unstable copies of the light family, and sure enough 
they also play a crucial, albeit more hidden, role in our uni- 
verse, such as processes inside stars or in the early uni- 
verse. We first look more closely at the quarks and there- 
after at the leptons. 


Quarks: flavors and colors. The first thing to note about 
quarks is that they carry fractional electric charges. Those 
in the left column have a charge $+2e/3$, whereas the ones
in the right column have $-e/3$, so that the proton which
corresponds to two ‘ups’ and one ‘down’ has indeed a
charge $e$, while the neutron made up from one ‘up’ and two
‘downs’ has zero charge. Besides their spin and charge,
we distinguish two other intrinsic properties that were brief- 
ly mentioned before: flavor and color. 
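The charge arithmetic for the proton and the neutron can be checked in a few lines. The sketch uses exact fractions to avoid rounding; the one-letter labels follow the flavor names introduced in the text, and the function name is merely illustrative.

```python
from fractions import Fraction

# Quark charges in units of the elementary charge e.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def hadron_charge(quark_content):
    """Total charge (in units of e) of a particle built from the given quarks."""
    return sum(QUARK_CHARGE[q] for q in quark_content)


print(hadron_charge("uud"))  # proton: 2/3 + 2/3 - 1/3
print(hadron_charge("udd"))  # neutron: 2/3 - 1/3 - 1/3
```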


Flavors. The so-called flavor index corresponds to one of 
the six letters (u, d, s, c, t, b), which in turn refer to their
names up, down, strange, charm, top and bottom. 


Besides the lightest nuclear particles or hadrons like the 
proton and neutron that make up stable matter, there are many
nuclear particles that also involve quarks of flavors other 
than the up and down, but as those quarks are heavier, 
the particles in which they appear tend to be unstable. By 
the way, I always found the use of the word ‘flavor’ in this
context a bit strange. What is ‘up’ or ‘down’ supposed to 
taste like, you wonder. The peculiar collection of flavor 
names for quarks has repeatedly given rise to exotic if not 
funny, even sexist expressions in titles of articles (involv- 
ing topless or bottomless particle models etc.), which after 
submission were of course instantly refused by the editors 
of the established journals. 


Flavor symmetry: the ‘eightfold way’. From a historical
point of view it is interesting to restrict ourselves to the
three-flavor case. It is the case described by Gell-Mann
in his ‘eightfold way’, based on an SU(3) flavor symmetry
group. To give you a flavor we have depicted some of
the geometric representations in which the particles are
classified according to this SU(3) scheme in Figure 1.4.33.
This representation is called the meson nonet, referring to
the nine possible quark anti-quark combinations of the up,
down and strange flavors, which gives $3 \times 3 = 9$ combinations.
This representation is one of the examples that
make up the aforementioned ‘particle zoo’ of nuclear particle
states. The fundamental particles form a triplet representation
of quarks and an anti-triplet of anti-quarks, and
these correspond to the blue and red triangles in the center.


Figure 1.4.33: The SU(3) way. Gell-Mann’s eightfold way is
based on the symmetry group that classifies the flavor properties
of the particles making up the ‘particle zoo’. The geometric
patterns by which the particles are labeled actually have a three-
or sixfold symmetry rather than an eightfold one. The observed
particles are the ones on the outer hexagon and the three particles
at the origin; together they form the meson nonet. The
symmetry has two fundamental three-dimensional representations
corresponding to the two triangles in the center, which
suggested the existence of three quarks $(u, d, s)$ and their anti-particles
$(\bar{u}, \bar{d}, \bar{s})$.

Figure 1.4.34: Quarks and SU(3). A beautiful early Islamic
tiling from the Real Alcazar (Royal Palace) in Sevilla in Spain,
exhibiting the sixfold symmetry characteristic of the group
SU(3). The black stars represent the weight-lattice of states corresponding
to the group SU(3). The representations correspond
to certain triangular or hexagonal subsets of states centered at 
the origin. 


Quarks have never been observed as individual, freely moving
particles; they are confined to composites that consist
of quark anti-quark pairs (the mesons) or (anti-)quark
triples (the baryons). The nonet consists of nine mesons 
made up of one of the three quarks paired with one of the 
three anti-quarks in the figure. These scalar (spin zero) 
particles, are, as you see, siblings of the pions we have 
mentioned before. The three particles in the center correspond
to different linear combinations of the $u\bar{u}$, $d\bar{d}$ and $s\bar{s}$
pairs. Why the ‘eightfold way’, you are inclined to ask, while
the picture clearly exhibits a sixfold symmetry? It turns out 
that eight of the nine mesons basically form an octet, that 
is a larger irreducible representation of the group, mean- 
ing that under the SU(3) transformations those particle 
states would be transformed into each other. The ninth 
($\eta'$) particle is all by itself and invariant under the symmetry
group. In the era when this scheme was proposed all the
observed particles could be catalogued in certain SU(3)
representations (like the octet mentioned before), and this 
of course shifted the quest for fundamental building blocks 
to the underlying level of the quarks. 
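The pattern of the nonet, six particles on a hexagon and three in the center, can be reproduced by simply listing all quark anti-quark pairs together with two of their quantum numbers. The sketch below uses electric charge and strangeness as the two labels; the convention that the strange quark carries strangeness $-1$ and the naming of the pairs are assumptions made for this illustration, and the labels differ from the axes of the figure by a simple linear change of coordinates.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# (electric charge in units of e, strangeness) for the three light quarks.
QUARKS = {
    "u": (Fraction(2, 3), 0),
    "d": (Fraction(-1, 3), 0),
    "s": (Fraction(-1, 3), -1),
}


def meson_quantum_numbers(quark, antiquark):
    """Charge and strangeness of a quark anti-quark pair (the anti-quark flips signs)."""
    q_charge, q_strange = QUARKS[quark]
    a_charge, a_strange = QUARKS[antiquark]
    return (q_charge - a_charge, q_strange - a_strange)


nonet = {
    f"{q} anti-{a}": meson_quantum_numbers(q, a) for q, a in product(QUARKS, repeat=2)
}
positions = Counter(nonet.values())

print(len(nonet))         # 3 x 3 = 9 combinations
print(positions[(0, 0)])  # three neutral, non-strange states sit at the center
print(len(positions))     # seven distinct positions: six on the hexagon plus the center
```

The counting alone does not tell you which combination of the three central states is the singlet; that requires the group theory behind the statement that the nine states split into an octet and a singlet.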


If the symmetry were exact, then that would imply that the
baryons or mesons that belong to a single representation 
of the symmetry group should have the same mass; the 
particles should be degenerate. This turns out not to be 
the case here, and therefore we say that flavor is only an 
approximate symmetry. Nevertheless having the symme- 
try patterns and the observed particles and their masses, 
Gell-Mann could see that certain particles were missing 
from the observations, and that way he could make quite 
precise predictions of their properties and therefore also 
say where to look for them. An example is the $\Omega^-$ particle
belonging to the decuplet representation of baryons and 
discovered in 1964. This is indeed reminiscent of the story 
of Mendeleev and his periodic table. 


It is actually somewhat ironic that the SU(3) or the much 
larger SU(6) flavor symmetry does not really feature in the 
Standard Model as we see from the panels in the figure. 
The flavors are there but they come in pairs, which refers 
to the weak interactions as we will explain shortly. The 
‘eightfold way’ is completely accidental from the Standard 
Model perspective. Nevertheless the family structure is 
very much present and even required for the consistency 
of the model. But that family structure as such is not ex- 
plained by the model. It is one of the challenges to look for 
yet more involved schemes. 


Color. The second property of quarks refers to what is 
called their color. Each flavor comes in three different ‘col- 
ors’, usually denoted as red, green and blue. In the figure 
that is visualized by the stack of three quark panels on 
top of each other. This ‘color’ quantum number is some
kind of charge to which the strong nuclear force couples, 
and needless to say, has nothing to do with ordinary color. 
This nomenclature is at least consistent, which cannot be
said of the flavors.


Figure 1.4.35: The standard model. Constituent particles and how the basic forces act between them.
(a) The constituent particles, quarks and leptons, are fermions. The particles mediating the forces are bosons, and the Higgs boson generates mass for the particles.
(b) The constituent particles come in three families. All ordinary matter is made of the first family of lightest and therefore stable particles.
(c) The electromagnetic force mediated by the photon affects all particles that carry electric charge.
(d) The strong nuclear force only mediates between the three different colors of quarks and does not distinguish flavor. It binds quarks into color neutral nuclear particles.
(e) The weak nuclear force affects all constituents, but it does not mix quarks with leptons within a family.


An excellent ‘artist’s impression’ of the flavor and color 
properties of quarks is given by the Magritte painting of 
Figure 1.4.32. The painting dates from 1963, one year before 
Gell-Mann and Zweig proposed the existence of quarks, and 
well before the color property of quarks was postulated, 
implying that the quite striking correspondence is entirely 
coincidental. 


Leptons: electrons and neutrinos. In the bottom pan- 
els on the right in Figure 1.4.35, we have listed the three 
families of leptons: the electron, the muon, and the tau 
family, including their respective neutrinos. The neutri- 
nos have pretty ghostly properties in that they have no 
charge and were long believed to be massless. It has 
quite recently been established, however, that they have 
tiny masses. They only interact weakly (and gravitation- 
ally), which means that we don’t see or feel them, in spite 
of the fact that we are permanently bombarded by billions 
of these neutrinos per second. They basically fly unhin- 
dered through most things, like the Earth for example. The 
evidence for their existence was for a long time just based 
on their absence, since the amount of missing energy and 
momentum in weak-decay processes pointed to the exis- 
tence of a massless, neutral particle — a neutrino there- 
fore. A tiny brother of the neutron. To catch a few of 
them we have to build detectors consisting of an incred- 
ible number of steel plates with very special (so-called flat 
wire chamber) detectors in between, and that is how af- 
ter a long time their existence was established in a direct 
fashion. The electron neutrino was the first to be 
discovered, in 1956, by Frederick Reines and his 
collaborators. He shared the 1995 Nobel prize for Physics for 
this discovery with Martin Perl, who discovered the tau lepton 
in the mid-1970s, quite some time after Leon Lederman, Melvin 
Schwartz and Jack Steinberger received the prize for the 
muon-neutrino in 1988. 


The matching of the lepton and quark panels in the figures is 
essential for the consistency of the model. But it is not 
known whether the family structure can be explained by some 
underlying mechanism, where the different family levels are 
excited levels of some underlying structure. 


Force mediators. In the top panels we see the force-mediating 
particles and the Higgs particle. The force carriers have spin 
one, which means that they are vectors like the 
electromagnetic gauge potentials. The Higgs has spin zero; it 
is a scalar particle without spin degree of freedom. To see 
what interactions these force particles mediate, it is best 
to look at the three figures at the bottom. On the left in 
Figure 1.4.35(c), we have the familiar electromagnetic 
interaction mediated by the photon denoted by γ, which is 
described by the QED part of the Standard Model. 
Electromagnetism only affects the blue-colored particles, 
which are the particles that carry electric charge. Note that 
all constituent particles carry charge except the neutrinos. 
And that is exactly the reason we can’t observe them very 
easily. The force particles (including the photon itself) are 
electrically neutral except for the W± particles, which carry 
a unit of charge. In the middle figure, we display the strong 
nuclear force, which only works between quarks and is mediated 
by the gluons denoted by g. This brings us to the theory of 
the strong interactions to which we now turn. We will discuss 
the weak interactions of Figure 1.4.35(e) in more detail 
thereafter. 


The strong interactions 


Quantum chromodynamics (QCD). The quantum theory for the 
strong nuclear force is called Quantum Chromodynamics (QCD). 
The strong force is mediated by 8 gluons, which are described 
by 8 color gauge potentials that manifest themselves in the 
presence of 8 ‘color-electric’ and ‘color-magnetic’ fields. 


Figure 1.4.36: Self-interaction of gluons. The photon (γ) has 
no self-interaction. The gluons (g) have self-interactions, 
which makes the theory very much nonlinear and much harder to 
deal with. The three gluon vertex is represented differently 
on the left of the next figure. 


Self-interactions. A crucial difference with QED is that the 
gluons themselves also carry color charge like the quarks. 
This means that they interact with themselves and that un- 
derstandably leads to complex nonlinear behavior. The 
reason why electromagnetism is so much simpler is pre- 
cisely that the photon does not carry electric charge and 
therefore does not interact with itself. This means that pure 
electromagnetism without charges and currents is a linear 
theory and indeed the source-free Maxwell equations are 
linear as we saw in Chapter 1.1. These have simple sinu- 
soidal wave-like solutions, which on a quantum level corre- 
spond to freely propagating photons. The essential differ- 
ence between the non-self-interacting photon and a self- 
interacting gluon is indicated in Figure 1.4.36. In addition, 
the effective strength of this color coupling is large, so it 
is hard to make successive approximations to higher order 
in the coupling. The language of Feynman diagrams loses 
much of its power because it is an approximation scheme 
that involves successive powers of the coupling constant. 


Figure 1.4.37: Color-flow diagram in QCD. A nice way to visu- 
alize the interactions in QCD. Quarks carry a single color line, 
while gluons carry two (different) lines. In the vertices the color 
charge is conserved, so, the colors and arrows have to match. 
Upper index goes into the vertex, lower index goes out. 


If that coupling is small the series is expected to converge 
and it suffices to only keep a limited number of lower or- 
der contributions to obtain a meaningful result. If that cou- 
pling becomes large the successive contributions keep in- 
creasing and one loses the convergence and hence the 
ability to make meaningful calculations and reliable predic- 
tions. 
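

The role of the size of the coupling can be made concrete 
with a toy series. The sketch below is only an illustration 
under a strong simplifying assumption: all coefficients of the 
expansion are set to one, so the series is 1 + g + g^2 + ..., 
whereas the coefficients in QED or QCD are far more intricate. 
The function name is hypothetical. 

```python
# Toy illustration of an expansion in powers of a coupling g.
# ASSUMPTION: every coefficient equals 1 (a geometric series);
# this is not the actual perturbation series of QED or QCD.


def partial_sums(coupling, max_order):
    """Return the partial sums of 1 + g + g^2 + ... + g^max_order."""
    sums = []
    total = 0.0
    term = 1.0  # the order-0 term
    for _ in range(max_order + 1):
        total += term
        sums.append(total)
        term *= coupling  # next power of the coupling
    return sums


if __name__ == "__main__":
    for g in (0.1, 2.0):
        print(f"g = {g}: partial sums up to order 5 -> {partial_sums(g, 5)}")
```

For the small coupling the successive terms shrink rapidly and 
a few low orders already give a stable answer; for the large 
coupling each new term is bigger than the previous one and the 
partial sums never settle. 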


An alternative way to think about gluons, quarks and the 
way they interact with one another, is given in Figure 1.4.37, 
where the (anti-)quarks are denoted by a single directed 
color line, and the gluons as an oppositely directed pair 
of lines. The picture is illuminating in that it shows very 
clearly what it means to say that color (charge) is locally 
conserved. The figure is not meant to imply that the gluons 
are actually made up of (anti-)quarks. Though they can 
manifest themselves in the same color anti-color ‘chan- 
nel’, the gluons represent independent physical degrees 
of freedom. The fact that strong self-interactions lead to 
unexpected behavior may not sound unfamiliar to us humans. 
Anyway, it is this feature in QCD that made it so hard to see 
from the basic structure of the theory what the resulting 
physics and phenomenology of quarks and gluons would be. 


Figure 1.4.38: Free electric charges. The electric field lines 
connecting the arrows go from the positron to the electron but 
they spread out widely over space. The electric charges are 
therefore not confined and we can observe them as free 
particles. 


Confinement. The binding mechanism between two quarks is very 
different from the attraction between two opposite electric 
charges. This is illustrated in Figures 1.4.38 and 1.4.39. In 
the first figure we see that the electric field between two 
opposite charges spreads over all of space, reflecting the 
1/r² force law. It is as if the field lines repel each other. 
In this case we can give one of the charges enough energy that 
the pair breaks up into two free charges. 


Figure 1.4.39: Confined color charges. In QCD the color 
electric fields do not spread but are forced into a narrow 
tube, which leads to the confinement of quarks. It is a 
consequence of the highly non-trivial nature of the ground 
state of QCD, which behaves like a color magnetic 
superconductor. 


The second figure shows what the color-electric fields between 
a quark and an anti-quark look like. The field lines are 
squeezed into a narrow tube that connects the pair. It is as 
if the field lines attract each other. The energy per unit 
length of the electric flux tube is constant because the tube 
is everywhere the same. This in turn implies that the 
interaction energy of the pair grows linearly with their 
separation. It would increase indefinitely, were it not that 
at a certain point the energy exceeds the energy needed to 
create a new quark anti-quark pair somewhere in between. Then 
what we basically have done is to create two pairs out of one 
pair! We cannot create a separate quark, because as a source 
it always has to stay connected to a tube. 
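

The linear growth of the energy and the breaking of the tube 
can be illustrated with a few lines of code. This is a toy 
sketch and not a QCD calculation: the energy per unit length 
of 1 GeV per fermi and the pair-creation cost of 1 GeV are 
assumed round numbers chosen only for illustration, and all 
names are hypothetical. 

```python
# Toy model of a color flux tube between a quark and an anti-quark.
# ASSUMPTIONS (illustrative round numbers, not taken from the text):
STRING_TENSION_GEV_PER_FM = 1.0  # energy per unit length of the tube
PAIR_CREATION_COST_GEV = 1.0     # energy needed to create a new pair


def tube_energy(separation_fm):
    """Energy stored in a tube of the given length (linear potential)."""
    return STRING_TENSION_GEV_PER_FM * separation_fm


def breaking_distance():
    """Separation at which the tube energy equals the pair-creation cost."""
    return PAIR_CREATION_COST_GEV / STRING_TENSION_GEV_PER_FM


def coulomb_energy(separation_fm, strength=1.0):
    """For contrast: a 1/r potential energy (arbitrary units), which
    approaches a constant at large separation so the pair can be freed."""
    return -strength / separation_fm


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0, 4.0):
        broken = tube_energy(r) >= PAIR_CREATION_COST_GEV
        state = "tube breaks into two pairs" if broken else "tube intact"
        print(f"r = {r} fm: tube energy = {tube_energy(r)} GeV ({state})")
```

Doubling the separation doubles the stored energy, which is 
the content of the statement that the energy per unit length 
is constant; the Coulomb-like energy, by contrast, levels off. 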


More generally, it turns out that the color force works in 
such a way that only color neutral composites of quarks, 
denoted as ‘color singlets’, can exist as free particles. And 
therefore these are the nuclear particles that we observe in 
nature. The way this usually is expressed is to say that color 
is confined. The property of color is hidden. Quarks and 
gluons are forever imprisoned. The simplest singlets are 
either made up of three quarks with different colors (these 
are the baryons like the proton and neutron mentioned before), 
or of color anti-color quark pairs (the mesons like the 
pions). And the same is true for the gluons themselves, 
because they carry color charge; they only appear in color 
neutral composites which are called glueballs. It is this 
constraint of color neutrality that explains why quarks and 
gluons are confined and why we cannot observe them as free 
individual particles like electrons. 


The confinement phenomenon presents us with a unique, 
paradoxical situation we did not encounter before. QCD 
is a theory formulated in terms of fundamental physical 
degrees of freedom (quarks and gluons) that are not dis- 
cernible, so that from a philosophical point of view you are 
tempted to question their very existence. ‘To be or not to 
be, that is the question!’ 


Asymptotic freedom: how strong becomes weak. What does it 
mean to say that interactions are strong? In this case it is a 
relative statement in that it is a force between protons and 
neutrons, or on a more basic level between quarks, that is 
strong enough to overcome the Coulomb repulsion so as to make 
nuclear binding possible. This implies that the coupling 
strength of the interaction is considerably larger than that 
of electromagnetism. What we mean to say is that the effective 
dimensionless number characterizing the strength of the 
interactions must be much larger, and this amounts to saying 
that the analogue of the electromagnetic fine-structure 
constant $\alpha = e^2/(4\pi\hbar c) \simeq 1/137$, which for 
the strong interactions is called $\alpha_s$, is of order 
unity. This tells us that at the relevant nuclear scale of 
1 fermi = $10^{-15}$ m the effective coupling is large. 


The confinement picture of Figure 1.4.39 shows that the color 
fields emanating from the quark are forced into a narrow tube 
that terminates at some antiquark. The tube has a 
cross-section which is typically of the confinement scale, 
say, one fermi squared. So here is how we should think about 
this. For distances much larger than one fermi, the quarks are 
confined, which means that the complicated nonlinear 
self-interactions of the gluons have collectively created an 
effective environment that causes the confinement. However, 
for distances much smaller than one fermi the quarks are 
effectively moving ‘freely’, in the sense that the effects of 
the self-interactions are negligible. In fact on such scales 
one could treat the strong interactions more like a type of 
electromagnetic interactions. The color field lines go 
radially out of the quark and bend over into the confining 
tube at a distance of about one fermi. On that small scale one 
could use the perturbative approach in terms of Feynman 
diagrams to calculate the dynamics. 


It is interesting to look at the result of such intricate 
calculations of the effective coupling strength as a function 
of momentum transfer (or inverse distance) both for $\alpha$ 
(QED) and $\alpha_s$ (QCD). For QED one obtains, 

$$\alpha(q^2) \simeq \frac{\alpha}{1 - (\alpha/3\pi)\,\ln(q^2/m^2)} \quad \text{for } q^2 \gg m^2, \qquad (1.4.30)$$

where $\alpha = 1/137$ and m denotes the relevant mass scale 
one is interested in. For QCD one obtains, 

$$\alpha_s(q^2) = \frac{12\pi}{(33 - 2f)\,\ln(q^2/\Lambda^2)} \quad \text{for } q^2 \gg \Lambda^2, \qquad (1.4.31)$$

where f = 6 equals the number of flavors and $\Lambda$ sets 
the mass scale one is interested in. We have plotted these 
curves in Figure 1.4.40. There are two striking differences 
between the two curves: (i) the relative difference in 
strength on the scales we are interested in is indeed big, 
about a factor of one hundred, and (ii) the strong interaction 
decreases substantially for increasing momentum and thus for 
smaller distances. The strong interaction gets weak at small 
distances! This property is called asymptotic freedom. It is 
of crucial importance because it allows for precise 
calculations of high-energy scattering processes where you 
probe very small distances and compare those to the 
experiments. 
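

The opposite trends of the two couplings can be checked by 
evaluating equations (1.4.30) and (1.4.31) directly. The 
following is a minimal sketch: the choice of the electron mass 
for m and the value of 0.2 GeV for the QCD scale are 
assumptions made here for illustration, and the function names 
are hypothetical. 

```python
import math

ALPHA_QED = 1.0 / 137.0  # low-energy value quoted in the text
N_FLAVORS = 6            # f = 6, as in the text


def alpha_qed(q_gev, m_gev=0.000511):
    """Effective QED coupling of Eq. (1.4.30), meant for q^2 >> m^2.

    ASSUMPTION: m defaults to the electron mass in GeV.
    """
    log_term = math.log(q_gev**2 / m_gev**2)
    return ALPHA_QED / (1.0 - (ALPHA_QED / (3.0 * math.pi)) * log_term)


def alpha_s(q_gev, lambda_gev=0.2, f=N_FLAVORS):
    """Effective QCD coupling of Eq. (1.4.31), meant for q^2 >> Lambda^2.

    ASSUMPTION: Lambda = 0.2 GeV is an illustrative value.
    """
    log_term = math.log(q_gev**2 / lambda_gev**2)
    return 12.0 * math.pi / ((33.0 - 2.0 * f) * log_term)


if __name__ == "__main__":
    for q in (10.0, 100.0, 1000.0):
        print(f"q = {q:7.1f} GeV: alpha = {alpha_qed(q):.5f}, "
              f"alpha_s = {alpha_s(q):.4f}")
```

With these assumed inputs the QED coupling creeps upward with 
increasing momentum while the QCD coupling falls, which is the 
behavior called asymptotic freedom. 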


So what happens if two quarks collide head-on in a collider? 
They may strongly scatter and the outgoing quarks or gluons 
may get a high transverse momentum. These individually colored 
particles have to pick up companions to make color singlets at 
a scale of one fermi and will in the end cause a so-called jet 
of outgoing singlet particles. This highly collimated shower 
of particles has a total momentum equal to that of the 
originally scattered quark, so that individual quark momentum 
is an observable in the above sense. 


Figure 1.4.40: Asymptotic freedom. Plots of the effective 
coupling strength as a function of momentum in GeV (or probing 
distance, with smaller distances to the right) for QED and 
QCD. Note the scales differ by a factor one hundred. The fact 
that the blue QCD strength becomes weaker at short distances 
is called asymptotic freedom. It means that in the high-energy 
regime the theory can be studied by the diagrammatic 
(perturbation theoretic) method. 


We can also turn the story around and start at very small 
distances where the theory is very well behaved and our 
intuitions make sense because the system is weakly coupled. If 
we move up in scale towards the infrared the coupling becomes 
stronger, and when the coupling becomes of order unity the 
system becomes strongly coupled and our predictive ability 
breaks down. Now what this quite often means is that something 
drastic like a phase transition is going to happen. The ground 
state of the system becomes unstable and will change. For 
example a non-trivial condensate may form, and in fact in a 
sense the nature of the condensate in QCD is quite well 
understood and investigated (by computer simulations). The 
idea is that there is a condensate of magnetic degrees of 
freedom, monopoles and fluxes, so that the ground state of QCD 
is very much like a magnetic superconductor, a medium which 
would indeed confine color-electrically charged particles, 
like quarks and gluons. 


To get some understanding of this mechanism we should look at 
ordinary (electromagnetic) superconductivity (type II), which 
will be discussed in Chapter III.3 in Volume III. The ground 
state of an ordinary superconductor corresponds to a 
condensate of electron (so-called Cooper) pairs. These cause 
the so-called Meissner effect, which means that magnetic 
fields are expelled from the medium. If you turn on a strong 
magnetic field over a slab of superconducting material, then 
thin filaments of one unit of flux will penetrate the 
superconductor. Now imagine that I have magnetic monopoles to 
play around with and suppose that I drag such a monopole into 
the superconductor, what would happen? Indeed, the magnetic 
flux of exactly one unit emanating from the monopole would be 
forced into one such narrow filament and look for a way out at 
the boundary of the superconductor where the field would 
spread out again. But that is nothing but saying that 
monopoles would be confined in such a superconducting medium! 
And the dual of this mechanism is operative in QCD: a 
color-magnetic condensate confines the color-electrically 
charged quarks and gluons. 


A final comment on this beautiful theory. Can we not in some 
way reformulate the theory in what we call a strong coupling 
regime where one over the coupling constant is the new 
coupling, which then can be taken to be small? This question 
was answered by Kenneth Wilson from Cornell University, and it 
amounted to a formulation of gauge theories on a discrete 
space-time lattice, in terms of link variables like the ones 
we considered on page 35 in Chapter 1.2, where we discussed 
the line integral of a gauge potential. In this formulation 
one can make a very controlled and systematic strong coupling 
approximation to QCD, and the most immediate success is that 
confinement is there right from the start. This means that in 
this approach the lowest order calculation of the interaction 
energy between two external quarks yields a linear potential 
between them, and that is what confinement means. In Figure 
1.4.39 we see that the field energy per unit length is 
constant. Then the question became to prove that there was no 
discontinuity (a phase transition) between the weakly coupled 
and strongly coupled regimes. This turned out to be the case 
and with that the lattice approach to QCD has become an 
indispensable tool in the study of the strong interactions. 
Wilson was awarded the physics Nobel prize in 1982 for his 
profound work on phase transitions, which is embodied in his 
fundamental work on the renormalization group, a very general 
approach to studying the scaling properties of physical 
systems that we will return to in Chapter III.4. This work 
established a deep connection between the work on phase 
transitions in statistical and condensed matter physics by 
Michael E. Fisher and Leo Kadanoff, and the renormalization 
program in quantum field theory going back to the early days 
of QED. 


Figure 1.4.41: Lead-ion collisions. A simulation of a lead-ion 
collision event for the ALICE detector at CERN, producing an 
enormous number of particles. To find what you are looking for 
is far worse than searching for a needle in a haystack! In 
these experiments one tries to recreate the conditions that 
were present throughout the universe shortly after the Big 
Bang. 


Figure 1.4.42: The Large Hadron Collider at CERN. The largest 
accelerator at this moment is the Large Hadron Collider (LHC) 
at CERN in Geneva. The protons are accelerated in two 
oppositely directed circular beams. The circumference of the 
large ring is 27 km. Pre-acceleration happens in the older 
Proton Synchrotron (PS) and the Super Proton Synchrotron (SPS) 
accelerators. 


The quark-gluon plasma. If we shoot two protons with very high 
energy onto each other, they surely break up, and what comes 
out are avalanches (called jets) of color-singlet particles: 
nuclear particles, but also leptons. Indeed in modern 
experiments the energies are so gigantic that thousands of 
particles are created in a single collision, as indicated in 
Figure 1.4.41 showing (a simulation of) a high energy event in 
the ALICE detector of CERN. In this experiment the physicists 
are trying to create a new high density state of matter 
denoted as the ‘quark-gluon plasma’, a state that may have 
existed in the very early universe directly after the Big 
Bang. They do that by banging lead-ions with very high energy 
into each other so that thousands of new particles are 
created, and for a fraction of a second these form a strongly 
interacting hot plasma made up of quarks and gluons with 
striking properties that should resemble the state of matter 
at the very early stages of the universe. 


Figure 1.4.43: Beta-decay. The diagrams for the beta-decay 
process. On the left a diagram at the level of nuclear 
physics, and on the right the resolution of the same diagram 
at the finer scale of the Standard Model. Note that the single 
vertex on the left corresponds to a pair on the right where 
the process involves a W⁻ particle mediating the weak nuclear 
force. 


The electro-weak interactions 


Figure 1.4.44: Higgs production. A standard model diagram 
representing a particular process by which the Higgs particle 
is produced from the scattering of two quarks. The 
experimental signature of this process is provided by the two 
tau leptons into which the Higgs instantly decays. 


The W and Z particles. Let us return to the tables 
representing the Standard Model and to the electro-weak 
interactions in particular. In Figure 1.4.35(e) we focus on 
the weak nuclear force, mediated by the charged W± and neutral 
Z particles. It affects all constituent particles in an 
interesting way: the W bosons induce horizontal transitions in 
the table, because they are electrically charged. Their 
interaction vertices allow for fundamental processes like: 


$$u + W^- \to d, \qquad (1.4.32)$$
$$e^- + W^+ \to \nu_e, \qquad (1.4.33)$$
$$W^- + t \leftrightarrow b. \qquad (1.4.34)$$


The horizontal moves stay within the (color)panels, so red 
quarks to red quarks, electron to its neutrino and so on. A 
transition from a lepton to a quark is not possible because 
the W bosons have unit charge and that doesn’t match 
the fractional difference in charge between a quark and a 
lepton. This in turn implies that the net number of quarks 
and the net number of leptons are separately conserved 
in these interactions. Take for example the process of 
‘β-decay’ of the neutron where: 

$$n \to p + e^- + \bar{\nu}_e, \qquad (1.4.35)$$

as depicted on the left in Figure 1.4.43. Recalling the 
compositions n = (udd) and p = (uud), the decay process above 
is in the Standard Model perspective composed of two 
non-trivial vertices on the constituent level: 

$$d \to u + W^-, \qquad (1.4.36)$$
$$W^- \to e^- + \bar{\nu}_e, \qquad (1.4.37)$$


and this process is depicted on the right-hand side of the 
figure. The ‘weakness’ of the transitions comes about because 
the probability of creating an intermediate W or Z particle is 
very low due to their large mass. That results in an energy 
barrier which suppresses the transition process. 
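

That the two constituent-level vertices respect the separate 
conservation of quark and lepton number can be verified by 
simple bookkeeping. The sketch below uses the standard 
assignments of electric charge, baryon number (one third per 
quark) and lepton number; the table layout and function names 
are hypothetical and serve only as an illustration. 

```python
from fractions import Fraction

# (electric charge, baryon number, lepton number) per particle.
# Standard assignments; the dictionary itself is illustrative.
QUANTUM_NUMBERS = {
    "u": (Fraction(2, 3), Fraction(1, 3), 0),
    "d": (Fraction(-1, 3), Fraction(1, 3), 0),
    "W-": (Fraction(-1), Fraction(0), 0),
    "e-": (Fraction(-1), Fraction(0), 1),
    "anti-nu_e": (Fraction(0), Fraction(0), -1),
}


def totals(particles):
    """Sum charge, baryon number and lepton number over the particles."""
    charge = sum(QUANTUM_NUMBERS[p][0] for p in particles)
    baryon = sum(QUANTUM_NUMBERS[p][1] for p in particles)
    lepton = sum(QUANTUM_NUMBERS[p][2] for p in particles)
    return charge, baryon, lepton


def is_conserved(initial, final):
    """True if all three totals agree between initial and final state."""
    return totals(initial) == totals(final)


if __name__ == "__main__":
    neutron, proton = ["u", "d", "d"], ["u", "u", "d"]
    print("d -> u + W-          :", is_conserved(["d"], ["u", "W-"]))
    print("W- -> e- + anti-nu_e :", is_conserved(["W-"], ["e-", "anti-nu_e"]))
    print("n -> p + e- + anti-nu:",
          is_conserved(neutron, proton + ["e-", "anti-nu_e"]))
    print("d -> e- (forbidden)  :", is_conserved(["d"], ["e-"]))
```

Exact fractions are used so that the thirds carried by the 
quarks add up without rounding errors. 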


The Higgs particle. Finally, in Figure 1.4.44, we show the 
complicated diagram of a process that contributes to the 
production of a Higgs particle (H) in the collision of two 
protons (or better, two quarks), where these exchange a weak 
W/Z boson, which can radiate a Higgs. This Higgs is extremely 
short-lived and is not directly observed. The signature of the 
Higgs production in the outgoing state is the presence of two 
τ leptons. The Higgs was found in 2012 by two large 
international experimental collaborations: ATLAS and CMS at 
the Large Hadron Collider (LHC) at CERN. 


The Higgs particle is an essential ingredient of the stan- 
dard model as it is involved in a mechanism by which the 
masses of the W and Z particles are generated. This is 
discussed in more detail in Chapter II.6 on symmetries and 
their breaking. 


This concludes our lightning review of what the cherished 
Standard Model of particle physics is about. In the next 
section we further explore the unification process in the 
successive formulations of fundamental physics at the sub- 
sequent stages of understanding. 


A brief history of unification. 


There are two possible outcomes: if the result con- 
firms the hypothesis, then you’ve made a measure- 
ment. If the result is contrary to the hypothesis, 
then you’ve made a discovery. 

Enrico Fermi 


We have so far talked mainly about the fundamental build- 
ing blocks and that translates into an inventory of what has 
been observed in experiments up to now. We have also 
reflected on the models for the interactions between these 
building blocks that account for the spectrum and the hier- 
archy of physical states. It is then interesting to step back 
and look at the history of theories, which is indeed also a 
history of concepts in theoretical physics. In Figure 1.4.45 
we have depicted this historical account focussing on the 
unification concept. 


On the bottom line we list the basic classes of physical 
phenomena concerned, and going upward we also ob- 
serve how they are linked to the fundamental forces, but 
we see also a progressing unification in the description of 
the fundamental physics. The two lines at the top repre- 
sent theoretical developments which are still considered to 
be speculative and for which we eagerly await new experi- 
mental clues. This figure nicely illustrates the fundamental 
paradox of how ultimate reductionism may well lead to a 
form of ultimate holism! 


Returning to the unification aspect, the first example is 
Newton’s theory of gravitation (1687) that unified heavenly 
and terrestrial mechanics. Another beautiful example is 
provided by Maxwell’s theory of electromagnetism (1865), 
which clearly unites electric and magnetic phenomena in 
one framework, but also includes electromagnetic fields 
and radiation like light, and therefore the subject of optics. 


After the Second World War we learned to appreciate and 
include the quantum principles as in Quantum Electrody- 
namics (QED), the theory of photons, electrons and their 
anti-particles the positrons. This highly successful theory 
motivated theorists to find theories for the weak and strong 
nuclear forces based on similar principles. 


The starting point was an approximate phenomenological model 
for the weak force proposed by Fermi in 1933. In the late 
1960s this was replaced by a consistent unified quantum theory 
for both the electromagnetic and weak forces. Many names are 
connected to this development, firstly Sheldon Glashow, Abdus 
Salam and Steven Weinberg, who formulated the theory and its 
particle content, including the particles mediating the weak 
force. For this work they shared the Physics Nobel Prize in 
1979. 


This model was then augmented with the all-important 
ingredient of the Higgs field by Peter Higgs, Robert Brout and 
François Englert. Brout died in 2011, and therefore Higgs and 
Englert shared the Physics Nobel prize in 2013, shortly after 
the particle was discovered at CERN. Finally we should mention 
the seminal contributions of Gerard ’t Hooft and Martinus 
Veltman, who constructed the consistent mathematical framework 
which enabled them to prove that the electro-weak theory was 
renormalizable, and which made comparisons of detailed 
predictions of the electro-weak theory with precision 
experiments possible. They received the 1999 Nobel prize for 
Physics for this work. 


The developments for the strong interactions took place partly 
at the same time. Chen Ning Yang and Robert Mills proposed in 
1954 the fundamental generalization of the Maxwell theory by 
extending the notion of the electromagnetic gauge invariance 
from the simple U(1) group to the non-abelian group SU(2). 
This led to a totally new, very beautiful non-linear system of 
equations, not surprisingly called the Yang-Mills equations. 
But it took quite some time before it was recognized that 
these equations formed the basis for the theories of both the 
strong and weak nuclear forces. In the section on Gauge 
symmetries of Chapter II.6, we discuss these symmetries and 
equations in more detail. 


One of the leading scientists in the particle physics devel- 
opments was the American Murray Gell-Mann who pro- 
posed the existence of quarks at the same time as but 
independently from George Zweig in 1964. Gell-Mann re- 
ceived the Physics Nobel prize for this and other contribu- 
tions in 1969. After that he also formulated Quantum Chro- 
modynamics (QCD) with his collaborators Heinrich Leut- 
wyler from Switzerland and Harald Fritzsch from Germany 
in 1973. This theory is based on the Yang-Mills equa- 
tions for the color gauge group SU(3). The binding mech- 
anism and confinement of quarks was largely proposed 
by Yoichiro Nambu who received the Nobel prize in 2008. 
The property of QCD called asymptotic freedom made it possible 
to make sensible predictions for the strong interactions at 
high energies. This was discovered in 1973 by the American 
physicists David Gross, David Politzer and Frank Wilczek, who 
received the Nobel prize for their work in 2004. We will say 
more about this shortly. 


Forces of nature, Unite! Let me once more emphasize 
that the unification in the description of such a wide variety 
of physical phenomena in the Standard Model was pos- 
sible because the different components are based on the 
same conceptual principles. These principles are those of 
quantum theory, those of special relativity, and the 
principle of local gauge invariance. The latter principle 
manifested itself in Maxwell’s theory as we discussed in 
Chapters 1.1 and 1.2, in Einstein’s general theory of 
relativity, and 
also in the Yang-Mills equations. Gauge invariance is a key 
ingredient because it is strongly tied-in with the notion of 
a force field and completely fixes what the interactions be- 
tween the forces and particles look like. 


In Figure 1.4.45 you see that we have added two more rows on 
top. They express some powerful ideas leading to further 
unification, ideas that go beyond the Standard Model but have 
not (yet) been vindicated by experiment and should therefore 
be labeled as speculative. The Standard Model has left us with 
a number of open questions that strongly hint in the direction 
of a single overarching quantum gauge theory that comprises 
the three non-gravitational interactions. Models of such type 
are called Grand Unified Theories or GUTs, but the proposals 
made so far have not been very successful. An example of such 
a hint is that the electric charges of proton and electron 
match perfectly, but within the Standard Model there is no a 
priori reason that they should have the same magnitude, except 
if there were magnetic monopoles around; these are not part of 
the Standard Model, but in GUTs they exist. Another hint is 
that the family structure is not explained; it may possibly 
result from some underlying structure. 


Figure 1.4.45: The well-established paths of unification that 
have led to the Standard Model, and conceivable paths beyond. 


Figure 1.4.46: Forces Unite! A call for the United Forces of 
Nature. 


The theory of gravity remains a case apart. In spite of the 
tremendous successes of Einstein’s theory, it has so far 
withstood all attempts to make it consistent with the princi- 
ples of quantum theory. This is a highly non-trivial matter 
and seems to require a radical change of perspective. On 
the other hand this should not be too surprising: gravity is unique 
because it directly concerns the primary notions of space 
and time themselves. 


The line of development starting around 1970 centered 
on a few additional concepts: the first is the notion 
of rigid supersymmetry, the second is that of local or 
gauged supersymmetry, which gave rise to supergravity 
theories, and the third is the basic step from point particles to 
extended objects like strings and so-called branes. We 
close this chapter with a lightning review of some of the 
salient features of these developments. 


Supersymmetry 


From bosons to fermions and back. The gauge symme- 
tries we have discussed so far transform certain particle 
types into each other. The SU(3) color group for example 
transforms the quarks of different colors into each other. 
The weak SU(2) transforms up and down quarks or elec- 
trons and their neutrinos into each other. But these gauge 
transformations always transform bosons into bosons and 
fermions into fermions. Supersymmetry is an intricate sym- 
metry which involves generators which themselves are fer- 
mionic with the crucial property that they transform bosons 
into fermions and back. It entails a drastic extension of the 
notion of symmetry. Its discovery and early development 
go back to the early 1970s. If we call the supercharge 
(or generator of the supersymmetry) Q, it has the following 
properties: 
$$Q^2 = 0,$$ 
$$Q\,|\text{boson}\rangle = |\text{fermion}\rangle\,; \qquad Q\,|\text{fermion}\rangle = |\text{boson}\rangle.$$ 
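
To make these two properties a little more tangible, here is a minimal sketch in a toy two-state system. The matrices and state vectors are assumptions chosen purely for illustration and do not come from any realistic field theory. Note that in such a minimal realization Q itself carries the boson into the fermion, while the step 'back' is performed by the adjoint of Q, since acting twice with Q gives zero.

```python
# Toy two-state model (an illustrative assumption, not a realistic theory):
# the vector [1, 0] stands for a bosonic state, [0, 1] for a fermionic state.

def matvec(m, v):
    """Multiply a 2x2 matrix (nested lists) by a 2-component vector."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

BOSON = [1, 0]
FERMION = [0, 1]

# Hypothetical supercharge: sends the boson to the fermion.
Q = [[0, 0],
     [1, 0]]

# Its adjoint (transpose, since the entries are real): fermion back to boson.
Q_DAGGER = [[0, 1],
            [0, 0]]

print(matmul(Q, Q))               # the zero matrix: Q squared vanishes
print(matvec(Q, BOSON))           # the fermionic state
print(matvec(Q_DAGGER, FERMION))  # the bosonic state
```

The point of the sketch is only that a single nilpotent operator can pair up two states of different statistics; everything else about supersymmetry (spin, space-time, the algebra with the Hamiltonian) is left out.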


One may add more supercharges, in which case we speak 
of extended supersymmetries. In four dimensions we have 
a maximum of N = 8 supersymmetries. The more supersymmetry, 
the more constrained the theory will be. As 
with other symmetries, particle types fall into representations 
of the various supersymmetry algebras, and these 
representations will contain both bosons and fermions. The 
smallest representation of N = 8 extended supersymmetry 
contains a spin-two particle which is a natural candidate 
for the graviton. So in that sense there is a fundamental 
link between supersymmetry and gravitation. 


The Dirac equation predicted the existence of anti-matter, 
and in a similar way supersymmetry predicts the existence 
of super mirror images of all the known particles. Bosonic 
particle species would have fermionic superpartners and 
vice versa. These partners are generally denoted as spar- 
ticles: squarks and sleptons, while the superpartners of 
the force particles are called gauginos, like the photino, 
the Winos etc. The corresponding fields are labeled by 
the same letters with a tilde on top and we have displayed 
some of them in Figure 1.4.49. In that figure they are de- 
picted as belonging to the massless sector of superstring 
theory that will be discussed shortly. 


A Minimally Supersymmetric Standard Model (MSSM). 
One thing we can conclude immediately is that unfortu- 
nately the presently observed bosons and fermions (the 
inhabitants of the Standard Model listed in Figure I.4.35(a)) 
cannot be each other’s superpartners, because the other 
properties do not match. It is like the situation with the 
Dirac equation where the proton could not be identified 
with the anti-electron because they have different masses. 
The proton (field) has its own Dirac equation. For a su- 
perpartner all intrinsic properties are the same except for 
the spin which differs by half a unit. So to make the world 
supersymmetric the very minimal thing one may do is construct 
the simplest N = 1 supersymmetric extension of the 
Standard Model, and that means just doubling the panels 
of Figure I.4.35(a) and putting tildes on all the particle symbols. 
This Minimal Supersymmetric Standard Model (MSSM) is 
actively studied and a lot of effort is devoted to ‘hiding’ 
the unwanted partners and finding possible experimental 
signatures that show up in high-energy experiments. You 
see that, in particular with extended supersymmetries, one 
is forced to accommodate large numbers of new particle 
species. And to break the supersymmetry even more particles 
have to be added. We will refrain from discussing the 
MSSM in more detail. 


The principal motivation to build the LHC at CERN was 
to find the Higgs particle, a crucial ingredient of the Standard 
Model that lacked experimental vindication. But the 
physicists had another deep motivation and that was the 
hope that the LHC would allow for the much more rev- 
olutionary discovery of supersymmetry as an underlying 
principle of nature. So far there has been no evidence for 
this. If the ‘sparticles’ are really there, they would make 
up a shadow world, which is extremely weakly coupled to 
our discernible world. Not having seen them up to now 
means that the supersymmetry would have to be badly 
broken in our universe, because breaking can give a con- 
siderable mass to the super partners. It is a bit like the 
‘Higgs breaking’ mechanism that gives mass to the W and 
Z particles that mediate the weak interactions in the Stan- 
dard Model. 


Yet, from another perspective it is not inconceivable that 
supersymmetry is a blessing in disguise. The lightest su- 
persymmetric particle is absolutely stable by construction, 
and it has been suggested that this lightest supersymmet- 
ric particle, for example the photino (the super partner of 
the photon), is a candidate for the elusive particle that 
makes up dark matter. It couples very weakly, is neutral 
and massive, and makes a perfect WIMP, a Weakly Inter- 
acting Massive Particle, that is favored in many cold dark 
matter scenarios. We briefly discussed this in the section 
on cosmology in Chapter 1.2 on page 76. What we may 
conclude at this point is that the discovery of superpart- 
ners in a lab like CERN or Fermilab would be a spectac- 
ular discovery in its own right, but would also put string 
theory (and supergravity) in a far more credible position as 
these theories predict their existence as a necessary ingredient 
of nature. We have to wait and see. One of the 
reasons science is demanding is that it requires so much 
patience. 
Figure 1.4.47: Montonen-Olive duality. The electric-magnetic 
charge lattice of the N = 4 supersymmetric SU(2) Yang-Mills 
theory. This is in the weak coupling regime, with on the horizontal 
axis a triplet of gauge bosons: the $W^\pm$ with masses $m_W = ef$, and 
$W^0 = \gamma$, which remains massless. On the vertical axis there 
is the same massless photon and two magnetically charged 
monopoles $M^\pm$; these are solitons with mass $m_M = 4\pi f/e$. The 
electric-magnetic Montonen-Olive duality corresponds to mirroring 
the lattice through the diagonal, interchanging the role of 
gauge particles and solitons and also interchanging $e \leftrightarrow 4\pi/e$; 
a strong-weak or S-duality. (Figure labels: Charge lattice; electric.) 


To illustrate the power and beauty of supersymmetry, we 
briefly discuss two further examples: one is the N = 4 
supersymmetric Yang-Mills theory, and the other is super- 
gravity which plays a vital role in modern superstring the- 
ory. 


N = 4 supersymmetric Yang-Mills. Let me briefly talk 
about a wonderful, somewhat exceptional class of models, 
which brings together a number of fundamental concepts 
that have been taking the stage in theoretical physics since 
the mid-1970s. The theories I am talking about are N = 4 
supersymmetric Yang-Mills theories. 


The marvel is that because of the N = 4 supersymmetry 
these theories are so constrained that their quantum 
behavior is well understood, even beyond the perturbative 
diagrammatic Feynman approach. This also implies 
that they exhibit an unusual kind of simplicity, which for the 
theorist makes them an ideal laboratory for testing novel 
ideas. It is for quantum field theorists what the roundworm 
C. elegans is for geneticists, so to speak. So it is not the 
theory on its own that is of particular relevance; rather, its 
extraordinary properties are what is of interest. 


Let us consider the simplest case where the gauge group 
is G = SU(2). The particle or field content of this N = 4 
gauge theory consists of a single spin-one super-multiplet 
that transforms as a triplet or vector representation of the 
SU(2). Because of the supersymmetry, one can generate 
a super-multiplet by acting with the supersymmetry generators. 
The fields have the following spin content: there 
is one spin-1 field (the gauge bosons of the 
theory), there are four spin-1/2 fields, and there are six spin-0 
(scalar and pseudoscalar) fields. All of them transform in the triplet 
representation of the gauge group. 
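
A quick way to see that this field content hangs together is to count states: supersymmetry requires as many bosonic as fermionic states in a multiplet. The sketch below does that bookkeeping. It assumes the standard on-shell counting in four dimensions (two helicity states for a massless spin-1 field and for a two-component spin-1/2 field, one state for a real scalar); these counting rules and all names in the code are assumptions of the sketch, not taken from the text.

```python
# Standard on-shell counting in four dimensions, taken as an assumption here:
# a massless spin-1 field and a two-component (Weyl) spin-1/2 field each carry
# two helicity states, and a real spin-0 field carries one state.
STATES_PER_FIELD = {"spin-1": 2, "spin-1/2": 2, "spin-0": 1}

# Field content of the N = 4 multiplet as listed in the text.
FIELD_COUNT = {"spin-1": 1, "spin-1/2": 4, "spin-0": 6}

FERMIONIC = {"spin-1/2"}

def count_states(gauge_components=1):
    """Return (bosonic, fermionic) state counts for the multiplet.

    gauge_components is the size of the gauge representation,
    for example 3 for the SU(2) triplet.
    """
    bosonic = 0
    fermionic = 0
    for spin, n_fields in FIELD_COUNT.items():
        states = n_fields * STATES_PER_FIELD[spin] * gauge_components
        if spin in FERMIONIC:
            fermionic += states
        else:
            bosonic += states
    return bosonic, fermionic

print(count_states())    # counts per gauge component
print(count_states(3))   # counts for the SU(2) triplet
```

With these assumptions the bosonic and fermionic counts come out equal, both per gauge component and for the full triplet, which is the balance one expects from a supersymmetric multiplet.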


The scalar fields act like a kind of Higgs field and break the 
SU(2) gauge symmetry to U(1), which we call electromag- 
netism in analogy with the electro-weak theory. Because 
of the symmetry breaking two things happen: 

(i) the gauge bosons $W^\pm$ acquire a mass $m_W = ef$, and 
$W^0 = \gamma$ is the massless U(1) ‘photon’. The parameter f 
has dimension [mass] and sets the scale of the breaking. 
There is also a neutral massless scalar particle that sur- 
vives in the breaking. 

(ii) this theory has non-trivial classical soliton solutions, 
corresponding to the so-called ’t Hooft-Polyakov magnetic 
monopoles. These are regular, finite energy classical field 
configurations that are stable for a topological reason, im- 
plying that magnetic charge is also strictly conserved. The 
monopoles $M^\pm$ have a magnetic charge $g = \pm 4\pi/e$ (twice 
the minimally allowed Dirac value) and have a mass (= energy 
of the classical field configuration) equal to 
$m_M = gf = 4\pi f/e$. Note that these magnetic monopoles are a 
necessary ingredient of this theory, and you are not free 
to leave them out. They represent magnetically charged 
‘particle like’ objects in the theory and what one may show 
is that upon quantization these monopoles also form spin- 
one supersymmetric representations. 


Electromagnetic duality regained. What is striking in this 
theory is that on the quantum level it exhibits a dual sym- 
metry between the electric and magnetic sectors of this 
theory. This is the non-abelian analog of the electric-mag- 
netic duality of the source-free Maxwell equations that we 
mentioned in Chapter I.1, and is called the Montonen-Olive 
duality. We have depicted the duality transformation on 
the electric-magnetic charge lattice in Figure 1.4.47, which 
shows the electrically charged gauge bosons $W^\pm$ and the 
charge neutral (self-dual) photon at the origin, as well as 
the magnetic monopoles $M^\pm$ on the vertical axis. The 
spectrum also allows for dually charged sectors, called dyonic 
and labeled D(n,m). This remarkable symmetry is a strong-weak 
or so-called S-duality. Indeed, if we take the electric 
coupling weak ($e \ll 1$), then the magnetic coupling is 
strong ($g = 4\pi/e \gg 1$). So this theory is, like the pure 
Maxwell theory, self-dual; it maps one to one onto itself 
under the duality transformation. The upshot is that we 
have two fully equivalent formulations of the same physics, 
one as the standard ‘electric’ gauge theory with massive 
$W^\pm$-bosons, a massless photon, and gauge coupling 
e, and the other as a ‘magnetic’ gauge theory with gauge 
bosons $M^\pm$, a massless photon and a gauge coupling 
$g \sim 1/e$. 
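
The mirror symmetry of the charge lattice can be checked with a few lines of arithmetic. The sketch below uses only the two classical mass formulas quoted above, $m_W = ef$ and $m_M = 4\pi f/e$, together with the duality map $e \to 4\pi/e$; the function names and the numerical values of e and f are assumptions made for illustration.

```python
import math

def w_mass(e, f):
    """Mass of the charged gauge bosons, m_W = e * f."""
    return e * f

def monopole_mass(e, f):
    """Mass of the monopoles, m_M = g * f with g = 4*pi / e."""
    return 4 * math.pi * f / e

def dual_coupling(e):
    """Duality map on the coupling, e -> 4*pi / e."""
    return 4 * math.pi / e

# Illustrative weak-coupling values (assumptions, in units where f = 1).
e, f = 0.1, 1.0
g = dual_coupling(e)

print(w_mass(e, f), monopole_mass(e, f))  # light W bosons, heavy monopoles
print(w_mass(g, f), monopole_mass(g, f))  # the two masses trade places
```

In the weakly coupled description the W bosons are light and the monopoles heavy; after the replacement of e by its dual the same two numbers appear with their roles exchanged, and applying the map twice returns the original coupling.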


Imagine what this means: if you turn up the coupling parameter 
e, you expect the strongly coupled theory to 
no longer be controllable and predictable. But in this case 
we have an alternative, not an alternative reality because 
there is only one reality, but an alternative perspective or 
description where that would-be violent and uncontrollable 
reality is very well behaved, completely calculable and pre- 
dictable. 


This special property derives from the fact that the theory 
is not only supersymmetric but also has conformal symmetry. 
This implies that the charges do not renormalize, and 
they do not develop a momentum dependence, unlike in the 
case of ‘asymptotic freedom’ of Figure 1.4.40. The fact that 
the coupling constant has no dependence on momentum 
or distance means that this quantum theory is scale invariant 
($r \to \lambda r$) and in fact conformally invariant because it is 
also invariant under inversion ($r \to 1/r$). It is a superconformal 
gauge theory. 


There is one more point about this superconformal gauge 
theory which makes it even more exceptional. Remember 
that we mentioned that in addition to the massless photon 
we have also a massless scalar particle in the theory. This 
particle mediates an attractive force between the other par- 
ticles with a coupling strength equal to the gauge coupling 
(the only coupling constant in the theory). Imagine we 
have two identical monopoles: we then expect there to be 
a Coulomb repulsion due to the photon, but now there is 
also the attractive scalar force, which is exactly equal but opposite. 
And as you may have guessed, these two forces 
cancel each other out, and that is truly remarkable. So, if 
you bring two monopoles together very slowly, they don’t 
feel any force pushing them apart. The mass of a multiply 
charged monopole with charge $mg$ scales exactly linearly: 
$M_{mg} = m\,M_g$. This implies that the masses are also not 
renormalized, and the classical mass formulas turn out to 
be exact. But adding a monopole with opposite magnetic 
charge is another story, because now the two forces add 
and the anti-pole feels an attractive force that is twice as 
strong. It is an unstable configuration: a monopole anti-monopole 
pair would annihilate and be converted into pure 
energy. And by the way, for the charged particles like the 
$W^\pm$ the same story holds. 
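
The bookkeeping of the two forces can be written out schematically. The lines below are only a sketch for two static monopoles at a large separation r, assuming units in which the Coulomb force between charges g reads $g^2/4\pi r^2$ and counting repulsion as positive; the labels $F_{MM}$ and $F_{M\bar M}$ are introduced here for illustration.

```latex
\begin{align}
  F_{MM}(r) &= \underbrace{+\frac{g^2}{4\pi r^2}}_{\text{photon}}
               \;\underbrace{-\,\frac{g^2}{4\pi r^2}}_{\text{scalar}} \;=\; 0 , \\
  F_{M\bar M}(r) &= \underbrace{-\frac{g^2}{4\pi r^2}}_{\text{photon}}
               \;\underbrace{-\,\frac{g^2}{4\pi r^2}}_{\text{scalar}}
               \;=\; -\,\frac{2 g^2}{4\pi r^2} .
\end{align}
```

The scalar exchange is attractive in both cases, while the photon exchange changes sign with the relative sign of the charges; that is all that is needed to get no force between like monopoles and a doubled attraction between a monopole and an anti-monopole.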


So, that’s the marvel: a supersymmetric, gauge and conformally 
invariant quantum field theory! A remarkable outlier, 
and indeed, some theorists feel tempted to quote Dirac’s 
1931 monopole paper, saying ‘One would be surprised if 
Nature wouldn’t have made use of it.’ 


We will see that if not nature, then at least the string the- 
orists have made use of it in a marvel called the AdS/CFT 
holographic correspondence that we will discuss later on. 


Local supersymmetry: supergravity. A first important 
and profound generalization of Einstein’s theory is a theory 
called supergravity, proposed in 1976 by Daniel Freedman, 
Sergio Ferrara and Peter van Nieuwenhuizen. Supergravity 
theories are invariant under local supersymmetry 
(or extended supersymmetry) transformations, and this 
means that the supersymmetry is gauged. It contains the 
Einstein theory but in addition to the graviton it predicts 
the existence of a fermionic partner called the gravitino 
with spin 3/2. These ideas have been worked out and ex- 
tended in great detail ever since by a sizeable community 
of devoted theoretical physicists. Extended supergravity 
theories would also encompass gauge symmetries of the 
Grand Unified type and were considered as candidates for 
a Theory of Everything. The maximally extended super- 
gravity in four dimensions, which features 8 supercharges, 
is related to a unique supergravity theory in 11 dimensions, 
which was constructed by Eugène Cremmer, Bernard Julia 
and Joël Scherk working at the École Normale Supérieure 
in Paris. The non-gauged N = 8 theory in four dimensions 
can be obtained from the eleven-dimensional one by 
compactifying seven dimensions on a seven-dimensional 
torus. 


The sobering fact is that there was no support from the 
phenomenological side (no supersymmetric partners ever 
showed up in experiments), and there is a myriad of extra 
particles that have to be accommodated (or better, eliminated) 
somehow. Moreover, the ultraviolet behavior of 
these theories of gravity kept causing problems. It turned 
out that they are not renormalizable, because of unwanted 
infinities that kept showing up in certain calculations. And 
this was not resolved until much later, when around 1995 it 
was recognized that supergravity was the low energy approximation 
to a theory called M-theory living in eleven 
dimensions. This Meta theory is the Mother of all ten-dimensional 
superstring theories, which we will talk about 
shortly. 


A Theory of Everything? 


Even if there is only one possible unified theory, 
it is just a set of rules and equations. What is it 
that breathes fire into the equations and makes a 
universe for them to describe? The usual approach 
of science of constructing a mathematical model 
cannot answer the questions of why there should 
be a universe for the model to describe. Why does 
the universe go to all the bother of existing? 
Stephen Hawking, A Brief History of Time (1988) 


Let us recapitulate the big steps we have discussed in this 
chapter: we started with classical particles and classical 
fields like the electromagnetic field. Then we introduced 
the quantum theory, where we described basically a sin- 
gle particle in a fixed external force field and that produced 
an extremely successful model for the atom with electrons 
in orbits around the nucleus. Then we moved on to include 
the kinematics of special relativity and that brought us to 
quantum field theory where the distinction between force 
fields and particles was lifted, since both are described by 
quantum fields whose spectrum consists of states with an 
arbitrary number of particles of the type described by that 
field. This program culminated in the highly unified Stan- 
dard Model. In their quest for an all-overarching Theory 
of Everything (TOE), that would also include gravity, the 
physicists took one step further and started moving in var- 
ious directions, all of which led up to the study of super- 
strings. 


What if ...? The unification scheme of Figure 1.4.45 at least suggests 
that a Theory of Everything is not excluded. So far 
no physical or logical veto that would prohibit such an 
overarching theory has come to light. The term ‘everything’ 
is unfortunate, because the pretension is not 
that such a theory would explain all observable physical 
phenomena; rather it would specify all the necessary and 
sufficient ingredients on a fundamental level, which would 
suffice to make a universe like ours. Still, you might wonder 
what it means if such a Theory of Everything (or TOE) 
exists. 


Most people think of it as a set of principles from which 
everything we do and do not know about nature would 
uniquely follow. We would take nothing for granted, not 
the existence of light nor certain particles, or even the existence 
of space and time. But maybe the starting point 
could be the notion of energy or information or of observability. 
As I said, everything is a lot, and practitioners aim 
a little lower. A TOE marks the end point of the quest for 
ever more fundamental and basic building blocks that are 
the subject of this chapter. 


The discovery of a (or should I say the) TOE would mark 
the closure of basic physics. This would be both an impres- 
sive and a surprising achievement. Nature would have a 
true bottom so to speak. Yet, from a practical point of view 
such a completion is not such a big deal really. It proba- 
bly would make physics a more boring place to be. Par- 
ticle physics would at best become some kind of tourist 
trap, which one might want to avoid because the interest- 
ing characters lived there a long time ago. A monument 
for intelligence! And beauty, yes of course! Lots to admire 
and enjoy. But adventure? Alas, no! 


But now I am talking like the physicists at the end of the 
nineteenth century who thought that the completion of ba- 
sic physics was imminent. And it certainly was not! Quite 
the opposite, the twentieth century turned out to be one 
of the most revolutionary, inspiring and successful eras in 
physics ever. A century of relativity (geometry), of informa- 
tion and of quantum, as we argued in Chapter 1.2. 


Around 1980 it became clear that theories of 1-dimensional 
extended objects called strings provided a drastically different 
perspective on the problem of gravity. They have 
been center stage from 1984 onwards, but these theories 
have so far not been able to impress with resolving exist- 
ing problems or with predictions that were confirmed by 
experiment. The relevance of string theory as a theoreti- 
cal laboratory is fully recognized as a powerful extension 
of quantum field theory, and it has helped us to understand 
such elusive concepts as quantum black holes and quan- 
tum phase transitions. And superstring theory keeps alive 
the hope for a Theory of Everything, a Holy Grail of parti- 
cle physics. Therefore we will conclude this chapter with 
a section on this topic which is still very much in a state 
of flux. As will become clear once more: beauty has its 
price. 


Superstrings 


And so we face a contradiction between quantum 
field theory and general relativity similar to the con- 
tradictions that led to quantum mechanics. Many 
physicists believe that this contradiction contains 
the seeds of an upheaval as profound in its own 
way as the discovery of quantum mechanics and 
relativity 

Edward Witten, Nature (1996) 


String theories in their present formulation are quantum 
theories of extended objects, like strings and (mem)branes 
of different dimensions. Mathematical consistency of 
this theory requires two conditions to be fulfilled: (i) the 
theory should be supersymmetric, and (ii) the theory lives 
in ten or eleven dimensions. A closed formulation of the 
theory that may exist in eleven dimensions, and for some 
mysterious reason is called M-theory, is not available, but 
a small set of ten-dimensional limiting descriptions of that 
theory are known, and these correspond to the five differ- 
ent superstring theories. 
Figure 1.4.48: String worlds. What the world might look like at 
x103 m. 


Just like a single quantum field describes an arbitrary num- 
ber of particles of a given type, the basic property of the 
fundamental (super)string is that it describes an infinite 
number of fields, or particle types! Most of them corre- 
spond to extremely massive particles that cannot be pro- 
duced in our accelerators. Crucial for phenomenology and 
falsifiability are the ‘massless’ fields that the theory pre- 
dicts. 

The one outstanding fact is that the theory includes grav- 
ity. The gravitational field obeys an equation to which Ein- 
stein’s equations are an approximation. This makes the 
theory a serious candidate for a quantum theory of all fun- 
damental particles and interactions including gravity. An 
important, quantessential, step forward, but many hurdles 
still have to be overcome. Most importantly, it is still 
not known how the beloved Standard Model fits in, though 
all the ingredients appear to be there. The problem so far 
is that the theory describes more than we need. 


Understanding gravity. Our understanding and interpretation 
of gravity have taken dramatic turns through history. 
Of course it started with the idea of a force, leading to the 
whole Newtonian dynamical framework including his ‘universal 
law of gravitation’. The second grand turning point 
came with Einstein’s theory of General Relativity, where 
it was shown that the gravitational force was just a manifestation 
of the curvature of space-time. Further searches 
were driven by the strongly perceived necessity to bridge 
the gap between quantum theory and general relativity. 


This turned into the elaborate field-theoretical edifice of 
supergravity in all its diversity. That approach turned out 
to have serious shortcomings and at some point seemed 
doomed, but then it gave way to superstring theory in which 
supergravity again found a safe haven. 


String theory, as a possible overarching quantum theory of 
all interactions including gravity, has passed through some 
major revolutions after its inception dating from the early 
1970s. It is customary to distinguish three eras of super- 
string theory: 


1st era (... → 1984): String theory as an attempt to describe 
the strong interactions (Veneziano, ...). 

2nd era (1984-1995): Superstring theory as a theory of 
quantum gravity (Scherk et al., Schwarz, Green, Witten, 
...). 

3rd era (1995-present): Extended objects or D-branes, 
M-theory and the holographic AdS/CFT correspondence 
(Polchinski, Witten, ’t Hooft, Susskind, Maldacena, Strominger, 
Vafa, ...). 


We see that string theory, as a would-be unified theory of 
all fundamental interactions including gravity, was launched 
in 1984. In that theory the gravitational field equations are 
derived from imposing conformal invariance on the under- 
lying string degrees of freedom that live on the world-sheet 
of the string. In that perspective gravity is an effective long-distance 
description of an underlying string dynamics and 
in that sense is an emergent phenomenon. 


However, in spite of its intrinsic beauty and elegance, string 
theory did not quite deliver. It forced us to accept many 
hard to swallow extras like supersymmetry, a 10-dimen- 
sional space-time, and quite some extra degrees of free- 
dom, of which no hint showed up in experiments. A del- 
uge of extra degrees of freedom that ‘nobody had ordered’ 
so to speak. But moreover it appeared that string theory 
had no direct answers in store concerning very important 
questions like how to treat realistic black holes quantum 
mechanically, and how to address the even more urgent 
questions concerning the direct experimental evidence for 
dark matter and dark energy. 


And that takes us to the present engagement of string theory, 
in particular with the idea of holography, which culminated 
in Juan Maldacena’s rather stunning Anti-de-Sitter/Conformal-Field-Theory 
(AdS/CFT) correspondence. This 
radical proposal was published in 1997 in a paper which is 
considered one of the most influential of the present era. 
It is often referred to as the gauge/gravity duality or Malda- 
cena duality. 


The gauge/gravity duality refers to a rather specific setting 
of the Anti de Sitter space-time (in various dimensions), but 
suggests a profound and generic aspect of string theory. 
The canonical example refers to the situation of string the- 
ory in the 5-dimensional Anti de Sitter (AdS) space-time. 
This space-time has a cosmic boundary, which is a flat 
4-dimensional Minkowski space-time. On that boundary 
lives a four-dimensional conformal quantum field theory 
(CFT), which is a large-N version of the N = 4 supersymmetric SU(N) gauge 
theory that we discussed in the previous section (the N in SU(N) 
counts colors, while N = 4 counts the supersymmetries). The duality 
says that the full string theory in the AdS background 
is exactly dual to the CFT on the boundary. Thus the the- 
ories are fully equivalent; they describe the same physical 
reality in two different perspectives. So for example if we 
have the formation and subsequent evaporation of a black 
hole in the Anti de Sitter universe, this process could be 
completely understood as some unitary time evolution in 
that boundary conformal quantum field theory. 


If you want, you can read the AdS/CFT correspondence 
in an even more (literally) ‘outlandish’ way, namely, that 
gravity and space-time are elevated to a holographic illusion! 
If we know everything about the conformal theory on 
the d-dimensional boundary, we would be able to recon- 
struct all conceivable (gravitational) physics in the (d+1)- 
dimensional space. This prompts the interpretation that 
gravity as such doesn’t really exist as a fundamental force. 
How elusive can reality be? If it doesn’t really exist, then 
it certainly wouldn’t have to be quantized. The quantum 
behaviour is emulated in a quantum field theory living on 
the boundary of space-time. Let me paraphrase this ironic 
state of the universe as an ironic state of mind: either we are 
an illusion, or the theory that claims that we are an illusion 
is an illusion. 


Strings: all fields in one? 


What is a string? Let us start with the most elementary 
type of string which directly connects with our intuition. A 
string is like an idealized one-dimensional tiny piece of a 
rubber band that moves through ordinary space and time. 
The motion of a string can be broken down into the motion 
of its center of mass, and a relative internal motion. For 
closed strings the relative motion corresponds to waves 
moving in either direction along the string. But you can 
also have open strings that have to satisfy certain boundary 
conditions, which basically say that their endpoints have 
to move with the velocity of light or that they have to be at- 
tached to some higher dimensional physical object called 
a D-(mem)brane. These boundary conditions ensure that 
the string has a certain tension which is an energy per unit 
length. This tension makes these strings very much like 
the strings on a violin that have oscillatory modes, known 
as standing waves that correspond to its basic, harmonic 
overtones. 


It is not hard to imagine how a string model is supposed to 
represent particles: if we are far away, we cannot resolve 
the internal structure of the string, and we see the string as 
a point-like object with a certain mass and momentum, and 
there may be other internal quantum numbers like charge 
or spin. Therefore strings manifest themselves as particles 
at large scales and low energies. The apparent mass 
of that particle corresponds to the energy of the internal 
(relative) oscillations of the string: 


$$E^2 = (P^{\mathrm{cm}})^2 + \sum_k (p_k^{\mathrm{rel}})^2 \simeq P^2 + \sum_k m_k^2\,. \qquad (1.4.38)$$ 


Clearly, the different oscillatory modes (labeled by k) have 
to correspond to different particle species, but the 
scale of the masses $m_k$ would be humongous. If the string 
is tiny, say of the order of the Planck length, then its internal 
modes are extremely hard to excite. You need an 
energy of the order of the Planck mass which we introduced 
in Chapter 1.3, i.e. $m_P \sim 10^{19}$ proton masses! We 
have to conclude that all the particles we know and love 
should correspond to different modes in the lowest energy 
or massless sector of the string. It is here that the su- 
perstring is important because it has a huge internal sym- 
metry group which means that the massless sector is also 
extended, containing all spin values starting at two (the 
graviton) all the way down to zero. The zero mass sec- 
tor of superstrings corresponds to the particle content of 
certain supergravity theories. 
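
To get a feeling for how far out of reach the excited string modes are, one can put in numbers. The sketch below compares the Planck mass with the proton mass and with the collision energy of the LHC. The numerical values are rounded standard values inserted here as assumptions (about $1.22 \times 10^{19}$ GeV for the Planck mass, 0.938 GeV for the proton, and roughly $1.4 \times 10^{4}$ GeV for LHC collisions); none of them is quoted in the text.

```python
# Rounded reference values in GeV (natural units, c = 1); all three numbers
# are assumptions of this sketch rather than values taken from the text.
PLANCK_MASS_GEV = 1.22e19
PROTON_MASS_GEV = 0.938
LHC_COLLISION_ENERGY_GEV = 1.4e4

def ratio(numerator, denominator):
    """Dimensionless ratio of two energies or masses in the same units."""
    return numerator / denominator

planck_in_proton_masses = ratio(PLANCK_MASS_GEV, PROTON_MASS_GEV)
planck_over_lhc = ratio(PLANCK_MASS_GEV, LHC_COLLISION_ENERGY_GEV)

print(f"Planck mass in proton masses: {planck_in_proton_masses:.2e}")
print(f"Planck mass over LHC energy:  {planck_over_lhc:.2e}")
```

With these inputs the first ratio is of the order $10^{19}$, as stated above, and the second shows that present accelerators fall short of the Planck scale by roughly fifteen orders of magnitude.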


The take home message at this point is merely that a sin- 
gle string carries an infinite number of different particle de- 
grees of freedom, of which only the massless sector is of 
phenomenological importance. So, one type of superstring 
may represent all different particle types and their superpartners, 
as we have indicated in Figure 1.4.49. So you 
should think of all the fields related to the particle types 
we have been discussing previously as corresponding to different 
modes of a single type of superstring. The higher 
mass modes are crucial to ensure that the theory is mathematically 
consistent; they help in making the theory well 
behaved at high energies. And that makes sensible calculations 
on the quantum level possible. 


Figure 1.4.49: Superstrings. All known particle types plus many 
more such as the superpartners or sparticles should correspond 
to different lowest energy modes of a superstring. These particle 
types were already ingredients of the earlier supergravity 
theories. (Figure labels: Ordinary Particles, including graviton and gluons; 
Superstring; Super Particles, including gravitino, photino, gluino, Wino and Zino.) 


The world-sheet. If a point particle moves through space- 
time, we call its trajectory a world-line. Similarly if a string 
moves along in space-time, it traces out a two-dimensio- 
nal surface embedded in space-time, a surface which is 
called a world-sheet. The world-sheet has one space-like 
dimension along the string and one time-like dimension to 
allow the propagation of the string. 


There are two related geometries in the formulation of string
theory: one is the two-dimensional intrinsic geometry of
the world-sheet and the other is the geometry of the
background space-time, also called the target space, in which
the string is moving. The world-sheet is parametrized by its
space- and time-like coordinates $(\sigma, \tau)$ and its
geometry is determined by a world-sheet metric
$g_{\alpha\beta}(\sigma, \tau)$. This world-sheet is embedded
in a ten-dimensional space-time with coordinates
$(X^\mu;\ \mu = 0, \ldots, 9)$ with its own metric $g_{\mu\nu}$.
The world-sheet is therefore described by its embedding,
$X^\mu = X^\mu(\sigma, \tau)$, that specifies its position in
space and time. If you give me a point $(\sigma_0, \tau_0)$ on
the world-sheet, the embedding yields a corresponding point
$x^\mu = X^\mu(\sigma_0, \tau_0)$ in space-time. A
three-dimensional impression of what that looks like is given
in Figure 1.4.51. The embedding provides a relation between
the two geometries in the sense that the embedding induces a
metric on the world-sheet from the space-time metric, just
like we can construct the metric on the surface of a sphere by
inducing it from the metric of the $\mathbb{R}^3$ in which we
embedded the sphere, as we did in Chapter 1.2.
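
The sphere example can be made concrete numerically. The sketch below is an illustration under stated assumptions, not the construction used in the book: it takes the standard embedding of a sphere in flat three-dimensional space, computes the induced metric $h_{ab} = \partial_a X \cdot \partial_b X$ by finite differences, and integrates $\sqrt{\det h}$ to recover the area $4\pi R^2$. The function names and step sizes are choices made here.

```python
import math


def embedding(theta, phi, radius=1.0):
    """Map surface coordinates (theta, phi) to a point X in flat R^3."""
    return (radius * math.sin(theta) * math.cos(phi),
            radius * math.sin(theta) * math.sin(phi),
            radius * math.cos(theta))


def induced_metric(theta, phi, radius=1.0, eps=1e-5):
    """Induced metric h_ab = dX/da . dX/db via central finite differences."""
    def derivative(step_theta, step_phi):
        plus = embedding(theta + step_theta * eps, phi + step_phi * eps, radius)
        minus = embedding(theta - step_theta * eps, phi - step_phi * eps, radius)
        return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    d_theta = derivative(1, 0)
    d_phi = derivative(0, 1)
    return ((dot(d_theta, d_theta), dot(d_theta, d_phi)),
            (dot(d_phi, d_theta), dot(d_phi, d_phi)))


def area(radius=1.0, n=200):
    """Integrate sqrt(det h) over the sphere with the midpoint rule."""
    step_theta = math.pi / n
    step_phi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * step_theta
        for j in range(n):
            phi = (j + 0.5) * step_phi
            h = induced_metric(theta, phi, radius)
            det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
            total += math.sqrt(max(det, 0.0)) * step_theta * step_phi
    return total


print(area(1.0), 4 * math.pi)
```

The same recipe, with the flat metric of $\mathbb{R}^3$ replaced by the space-time metric and $(\theta, \phi)$ by $(\sigma, \tau)$, is what gives the 'area' of a world-sheet.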


The modelling of a string propagating in space-time involves
an action (or Hamiltonian) that has all the required
invariances and couplings, and loosely speaking it
corresponds to the 'area' of the world-sheet. This is not too
surprising if you remember that the string has a tension and
therefore wants to minimize its length and thereby its energy
(energy = length × tension).


What I want to get across here is that the expression for
the 'area' of the world-sheet involves the induced metric
on the world-sheet, which is an expression that in turn
depends on the $\sigma$ and $\tau$ derivatives of the
space-time coordinates $X^\mu(\sigma, \tau)$. What this means
is that in this formulation of string theory, the string
dynamics is like a quantum field theory defined on the
world-sheet, where the space-time coordinates $X^\mu$ play the
role of a set of (d + 1) scalar fields. So, yes, we are indeed
quantizing space-time in the sense that we quantize the
coordinates. For superstrings the story is similar: a
superstring moves in superspace, which also has fermionic
coordinates, and those provide fermionic field degrees of
freedom on the world-sheet.


The string action has to be a scalar quantity and therefore
will also involve the space-time metric $g_{\mu\nu}$, which
depends on the space-time coordinates and makes the action
highly nonlinear in the scalar fields. But let us for a moment
assume we study the string in flat space-time, so that
$g_{\mu\nu} = \eta_{\mu\nu}$ is constant. Then the action will
be invariant under space-time translations and Lorentz
transformations, and therefore we expect that the spectrum of
the theory will reflect that and can be interpreted in terms
of representations of the Lorentz group; these label the
space-time fields that the string theory produces. That is,
for example, how the graviton, as a massless spin-two
representation, shows up in the spectrum of the string.


Background dependence. String theory goes fundamentally
beyond General Relativity, because according to this
theory space-time itself is supposedly made up of strings.
In the actual formulation of superstring theory we have to
deal with this paradox that on the one hand the strings
propagate in a given background space-time, and on the
other the actual background space-time is made up of strings.


The background should be the outcome of the theory, it 
has to be predicted. This leads to certain consistency re- 
quirements. Space-time, as we experience it, is a man- 
ifestation of the collective behavior of strings, a kind of 
background or ground state. It would imply that space and 
maybe even time are ‘emergent’, an idea that would have 
tremendous philosophical implications as well. 


In the massless sector of superstring theory we also find 
spin-one-half fermionic constituent particles, as well as the 
known spin-one force fields. Moreover these fields would 
couple in the correct way because the gauge symmetry 
principles that underlie both the Standard Model and
Einstein's gravity theory are naturally built into string theory.
In a sense, string models have too much symmetry and 
therefore predict many extra particle types. We do not 
want these particles because we do not see them in na- 
ture; this implies that certain sectors of the theory have to 
be suppressed or even removed. To do that in a consistent 
way is a challenge. 


As I have mentioned, for string theory to make sense two
strong theoretical constraints have to be met. Number one
is the existence of supersymmetry and number two a
stringent condition on the dimensionality of space-time. For
superstrings the space-time dimension is ten. Indeed these
conditions seem quite unnatural and make you worry,
because neither supersymmetry nor a ten-dimensional
space-time has been observed. Yet from a mathematical
consistency point of view there is no doubt about the
necessity of imposing them from the start.


String interactions. In field theory the basic interactions
are represented by interaction vertices where three or more
particle lines meet, and there is a coupling constant
associated with each vertex. As strings represent all particle
types, their interactions should somehow take care of all
particle interactions. In Figure 1.4.50 it is clear that for the
closed strings there is only one type of interaction vertex,
which corresponds to the joining and splitting of two strings,
and hence there is only one string coupling constant,
called $\lambda$. The external legs of the string diagram, which
represent the incoming and outgoing particles, are taken
to be in the desired particle modes. This is how the
different modes get coupled together, and the complicated
bookkeeping of labels is in some sense implicitly done by
simply drawing the corresponding string diagram. A higher
order string diagram with a number of incoming and
outgoing strings is depicted in Figure 1.4.51.


String quantization. 


Optimal paths. If you use Google maps it helps you find the 
shortest route. It offers you the choice between the short- 
est route in distance or the shortest route in time. What 
do particles do? If we take a photon, it will move from A 
to B along a path of minimal action, but what does that 
mean? The photon will certainly move along a straight line 
but that is the shortest in both time and distance. To re- 
ally find out we have to do one more step. We know that 
in a medium like water or glass light moves slower than in 
vacuum or air. So, if A is under water and B above water 
the photons do not move along a straight line from A to B, 
they take a route that consists of two straight sections that 


Figure 1.4.50: String interactions. Strings have only one fun- 
damental interaction vertex consisting of breaking or joining 
strings. So all different particle interaction diagrams of a given 
topology can be represented by a single string diagram. 


make an angle at the surface. This is the problem we
discussed in detail at the end of Chapter 1.1 on page 18. The
angle depends on the refractive indices of the two media,
which are inversely proportional to the speed of light in
them. The path that takes the shortest time is not the
straight line from A to B, but rather a line that is broken at
the surface. So the classical trajectories are optimal in the
sense that they are minimal action trajectories: they
correspond to local minima of the action.
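
The broken path can be found by brute minimization of the travel time. The sketch below is an illustration with assumed numbers: A sits one unit under the water surface, B one unit above it and two units to the side, and the refractive indices 1.33 and 1.00 are typical values for water and air. A simple golden-section search locates the crossing point, and the result can be compared with Snell's law, $n_1 \sin\theta_1 = n_2 \sin\theta_2$.

```python
import math

N_WATER = 1.33   # assumed refractive index below the surface
N_AIR = 1.00     # assumed refractive index above the surface
DEPTH_A = 1.0    # A is at (0, -DEPTH_A)
HEIGHT_B = 1.0   # B is at (OFFSET_B, HEIGHT_B)
OFFSET_B = 2.0


def travel_time(x):
    """Travel time (in units with c = 1) for a path crossing the surface at x."""
    return N_WATER * math.hypot(x, DEPTH_A) + N_AIR * math.hypot(OFFSET_B - x, HEIGHT_B)


def best_crossing(tol=1e-10):
    """Golden-section search for the crossing point of least travel time."""
    lo, hi = 0.0, OFFSET_B
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if travel_time(x1) < travel_time(x2):
            hi = x2   # the minimum lies in [lo, x2]
        else:
            lo = x1   # the minimum lies in [x1, hi]
    return 0.5 * (lo + hi)


def snell_mismatch(x):
    """n1 sin(theta1) - n2 sin(theta2) for the path crossing at x."""
    sin1 = x / math.hypot(x, DEPTH_A)
    sin2 = (OFFSET_B - x) / math.hypot(OFFSET_B - x, HEIGHT_B)
    return N_WATER * sin1 - N_AIR * sin2


x_best = best_crossing()
print("crossing point:", x_best)
print("Snell mismatch:", snell_mismatch(x_best))
print("time saved versus straight line:", travel_time(1.0) - travel_time(x_best))
```

The optimal crossing point lies closer to A than the straight line would have it: the light shortens its stretch in the slow medium at the cost of a longer stretch in the fast one.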


That raises the interesting question that all school kids ask:
How does a photon know which path to take? It can't do
the necessary calculations, can it? No it can't! So it does
not use Google's algorithm, which amounts to calculating
most of the nearby paths in a restricted domain and
choosing the optimal one. The photon, being a quantum
particle, does in fact perform a quantum computation: it takes
all paths simultaneously and lets them interfere, and what
comes out is a weighted sum over all possible paths the
photon could have taken. In short the photon performs a path
integral.


Figure 1.4.51: Joining and splitting strings. Artist impression
of strings moving in space-time and interacting by joining and
splitting. It illustrates the geometric nature of string interactions.
This string world-sheet has holes (handles) and closed
boundaries which represent the incoming and outgoing closed strings.


The path integral. If we make the step to the quantum
description of the particle, then we have to include all
possible paths from A to B. The quantum propagation is now
a weighted sum over paths, which is the famous Feynman
path integral. The probability amplitude to go from A to B
is a superposition of amplitudes; the contribution of each
path is weighted by an exponential phase factor where the
phase is exactly the classical action of the path divided by
$\hbar$.


So indeed in quantum mechanics these contributions can 
reinforce each other if they are in phase at B or dampen 
each other if they are out of phase; quantum particles in- 
terfere with themselves! 
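
A toy calculation shows this interference at work. The following is a minimal sketch under strong simplifying assumptions: a free particle in units where $\hbar = m = 1$, and only 'kinked' paths that go from A to an intermediate position at half time and then on to B. Each such path contributes a phase factor $e^{iS/\hbar}$; the overall normalization of the path integral is ignored. Paths near the classical straight line add up coherently, while an equally wide bundle of paths far away largely cancels.

```python
import cmath

HBAR = 1.0   # natural units (an assumption of this sketch)
X_A = 0.0    # starting point A
X_B = 2.0    # end point B
MASS = 1.0
HALF_TIME = 1.0


def action(x_mid):
    """Classical action of a free particle for the kinked path A -> x_mid -> B."""
    v1 = (x_mid - X_A) / HALF_TIME
    v2 = (X_B - x_mid) / HALF_TIME
    return 0.5 * MASS * (v1 ** 2 + v2 ** 2) * HALF_TIME


def amplitude(x_low, x_high, steps=20000):
    """Sum the phase factors exp(i S / hbar) over paths with x_mid in [x_low, x_high]."""
    dx = (x_high - x_low) / steps
    total = 0j
    for k in range(steps):
        x_mid = x_low + (k + 0.5) * dx
        total += cmath.exp(1j * action(x_mid) / HBAR) * dx
    return total


near = amplitude(0.0, 2.0)   # bundle centred on the classical path x_mid = 1
far = amplitude(5.0, 7.0)    # bundle of the same width, far from the classical path

print("near the classical path:", abs(near))
print("far from the classical path:", abs(far))
```

This is the sense in which the classical path of least action emerges from the sum over all paths: it is where neighbouring contributions are in phase.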


Going Euclidean. Here is another interesting aspect of
path integrals. In general they involve paths or
configurations in space-time, which has a Lorentzian and not
a Euclidean signature. However, what physicists often do
when calculating or defining these integrals is to 'deform'
them to Euclidean space, hence the term Euclidean path
integral. We calculate the Euclidean action of the paths,
and those with high action are exponentially suppressed
with an exponent that equals minus the action divided by $\hbar$.
One interesting consequence of this is that if we take the
Euclidean action of a (d+1)-dimensional physical system,
we are summing over spatial configurations in a
(d+1)-dimensional Euclidean space. But, as we argued in
Chapter 1.1, this is very much like doing statistical mechanics
in (d+1) dimensions, where we for example calculated the
partition function of the system as a sum over all possible
configurations weighted by the Boltzmann factor, which was
also an exponential of minus the energy divided by $kT$.
This correspondence expresses a profound relation
between calculations in quantum field theory in d spatial and
1 time dimensions and calculations in statistical physics in
(d+1) spatial dimensions. We will return to this connection
in Chapter III.4.


From particles to strings. I tell you this particle story
because it helps to understand how string theory can be
formulated as a generalization of a theory of point particles
to a theory of one-dimensional extended objects. A
classical path would typically correspond to a minimal action
configuration of the world-sheet, which corresponds to an
extremal ('optimal') area of the world-sheet. The Euclidean
equivalent for a closed string world-sheet would be a
soap bubble surface between two solid rings. And we know
that a real soap bubble chooses the 'minimal energy'
surface. Such a soap film also shows what the fluctuations
about the minimal energy configuration (the straight cylinder)
look like: the string moves at intermediate times about its
equilibrium position. There could also be wiggles running
transversely (meaning along the string), but those have higher
energy and unfortunately could not be excited with the
soap-bubble kit I gave my daughter for her birthday in a failed
attempt to make her study physics a long time ago.




Figure 1.4.52: String propagator. On the left the classical prop- 
agation of a string in space-time corresponding to the minimal 
action world-sheet bounded by the initial and final state. On 
the right the quantum propagation as a weighted sum over all 
string world sheet configurations bounded by the initial and final 
states. On the left there is a unique intermediate state, while on 
the right we have a superposition of many. 


A string amplitude implies summing over all possible
world-sheets satisfying the appropriate boundary conditions. This
is illustrated in Figure 1.4.52, where on the right we have a 
superposition of configurations that contribute to the prop- 
agation from the initial (left) to the final (right) configuration. 
The action is basically the area, which depends on the 
metric on the world-sheet. The problem then boils down 
to the construction of a correct and well-defined measure 
for doing this integral over the ‘space of all metrics’, with- 
out leaving metrics out, but at the same time not over- 
counting. This is a complicated mathematical problem be- 
cause of the huge symmetries in the problem. 


This basically concludes my ham-handed introduction to 
the quantization of strings including small world-sheet fluc- 
tuations. 


Figure 1.4.53: Euclidean world-sheet. Woman keeping up the
Euclidean appearance of a 'vacuum bubble', a contribution to the
vacuum-to-vacuum amplitude for a closed string. (Source:
Atelier bulles geantes)


Weakly and strongly coupled strings. An important question
at this point is: what are the world-sheet configurations
that matter most, and will dominate the path integral?
In the figure just mentioned I have clearly limited myself
to rather small fluctuations around the minimal area classical
configuration. This is what a cheap navigator in your
car would do: it misses out on surprising, not so obvious
shortcuts. You see for example that the world-sheets
I included all have the trivial cylindrical topology.
I have apparently not allowed for string interactions,
meaning splitting and joining of tubes, and creating holes in the
world-sheet, like we depicted in Figure 1.4.53. What this
basically means is that I have assumed that the string
interactions are weak. The stronger the string interactions
are, the easier (more probable) the excitation of these
complicated surfaces of high genus will be.


A full string amplitude requires summing over all possible
world-sheets that satisfy the appropriate boundary
conditions, which means that you end up with a sum over genera
(the number of holes) that label the topological class of the
world-sheet, and for any given genus you have to sum over
all compatible metrics.


So as a relevant example let us consider the
vacuum-to-vacuum amplitude for closed strings. This involves
summing over all closed (no boundary) two-dimensional
surfaces of arbitrary genus: these are called Riemann
surfaces. They are embedded in (d+1)-dimensional Euclidean
space and weighted by the negative exponential of
their area. One such configuration with some holes
features in Figure 1.4.53 (see footnote 8). So this particular
amplitude is now equivalent to an interesting problem in
statistical physics: the calculation of the partition sum of
random 2-dimensional surfaces in a (d+1)-dimensional space.


What these considerations make clear is that our naive in- 
tuitions about string theory really only apply to the case of 
weak string coupling, and nobody knows what a strongly 
coupled string theory actually means. That is to say, up
to about 1995 nobody understood, but after the so-called 
second string revolution in the present era of string the- 
ory, we know much better what’s going on. We will discuss 
some of this shortly. 


Five superstring theories. You would maybe hope that
with such outlandish requirements string theory would be
unique; after all, how could a Theory of Everything
not be unique! But this appeared not to be the case. There
are five different superstring theories in ten dimensions,
which differ very much in the symmetries they have. Let
us list these theories without going into further detail about
their specific features: Type I, Type IIA,
Type IIB, Heterotic E8 × E8, and Heterotic SO(32). We
have depicted these superstring theories and how they
are connected to each other and with M-theory, to be
discussed shortly, in Figure 1.4.60.


8 A not-so-nice colleague suggested that this was part of a job
application ceremony.


So, either these theories are wrong, or we have to work
very hard to understand how our not-so-supersymmetric,
not-so-ten-dimensional world can be interpreted as a
not-so-simple solution to the equations that govern string
theory. In that sense the theory does make very strong
predictions, which at least in principle are falsifiable. This
does not imply, however, that we can just go out and do a
decisive experiment. Predictions that have spent ages in
waiting rooms are not uncommon in science, and this now also
applies a fortiori to the very fundamentals of the quantum
gravity world.


So far we have discussed (extended) supersymmetry as 
a distinguishing feature of supersymmetric gauge theories 
and supergravity theories. The next question is what the
additional requirement that space-time be ten-dimensional 
means. 


Ten-dimensional space-time? The second consistency
condition of string theory is that space-time has to be
10-dimensional; this is in strong contrast to the 4-dimensional
version we are all familiar with. At first this requirement
sounds too outrageous to be true, and it is convincingly
falsified every day by our own experience! But for theorists
nothing is insurmountable; they bear in mind Einstein's
consoling words: 'Subtle is the Lord but malicious He is
not.' And they have cooked up scenarios of how to get
rid of six of those ten dimensions by a procedure called
compactification. This amounts to tightly 'rolling up' the
extra dimensions into a variety of compact manifolds like
spheres or tori or combinations thereof.


Kaluza-Klein theory. This way of effectively reducing the
dimension of space has a long history, and goes back to
the first quarter of the twentieth century. Theodor Kaluza
and Oskar Klein independently proposed a geometrical
unification of the gravitational and electromagnetic forces,
by looking at General Relativity in five dimensions, where
they assumed that the fifth dimension would be curled up
into a tiny circle. Interestingly, the extra components of
the curvature tensor you get in going from four to five
dimensions would correspond to the degrees of freedom of
the electromagnetic field in four dimensions (and an extra
scalar field). The extra components of the metric would be
$g_{\mu 5} = g_{5\mu}$. These generically correspond to the
gauge potential $A_\mu$, with $g_{55}$ being an extra scalar
field. Furthermore, the whole 5-dimensional system of
Einstein's equations, after compactification, correctly
reproduces the coupled Maxwell-Einstein equations in four
dimensions. The momentum component in the fifth dimension of a
moving particle basically corresponds to its electric charge.
Einstein actually liked the idea, but it was hard to reconcile
with quantum theory and therefore slid into decline.


Figure 1.4.54: Compactification. Here we give an idea how
one spatial dimension can be compactified to a circle. The
rotational U(1) symmetry of the circle results in the existence of
a U(1) gauge field in the large dimensions. The relative coupling to
the matter fields is inversely proportional to the radius R of the
internal circle. In this perspective taking a weak coupling limit is
like opening up an extra dimension ($R \to \infty$). This is the
original dimensional compactification scheme of Kaluza and Klein,
which also plays a role in going from 11-dimensional M-theory
to 10-dimensional superstring theories.


In science, attractive ideas that don't quite work can be
safely stored away in a kind of fridge. This fridge consists
foremost of the collective memory of the scientific
community, and the written records of course. Ideas can
hibernate for years or even for a century or so, before
getting rediscovered and making a glamorous come-back in
a novel context. Compactifying dimensions à la Kaluza-Klein is
such an idea. The extra dimensions would probably be too
tiny to see with present-day accelerators. To probe such
small sizes you need correspondingly small wavelengths,
which means very high energies. So in that way you escape
the manifest presence of those dimensions. But there is
one feature that would be manifest at low energies. If the
compactified space has symmetries, and it usually has,
those symmetries after quantization give rise to massless
particles that would clearly manifest themselves, also at
low energies. It is precisely the rotational symmetry of the
circle geometry of the fifth dimension that generates the
massless photon in the Kaluza-Klein scenario!


Suppose we compactify one dimension of space into a
circle; that has important consequences for the allowed
states of a quantum particle. Remember that the spectrum
of the momentum or energies for a free particle in flat
space is continuous, and the wavefunction for a fixed
momentum ($p$) state corresponds to a sinusoidal wave with
the wavelength $\lambda = h/p$. With the compactified dimension
being a circle, the particle momenta are quantized exactly
as in the old Bohr model we discussed in Chapter 1.3 on
page 134, because we have the periodicity condition that
the wave has to fit on the circle: $n\lambda = 2\pi R$ and
therefore (relativistic) $E_n = p_n \sim n/R$. And in the
original K-K model this component of the momentum is just the
charge of the particle; that charge is thus quantized. The
attentive reader will now ask whether here we have a model
without monopoles where charge would be quantized. How
come? The answer is that in this model topologically stable
monopole configurations do exist as solitons, much like
in the supersymmetric gauge theory we discussed before.
So Dirac does not have to turn in his grave!


Figure 1.4.55: T-duality. The non-oscillating modes of a
(bosonic) closed string on a circle with compactification radius R.
The energy is $E = E_m + E_n = mR + n/R$, where m is the
(topological) winding number and n the momentum, which is also
quantized. We have drawn the vertical lines for fixed R,
demonstrating that the energy spectra for radius R and 1/R are
identical if we interchange m and n. Taking the limit
$R \to \infty$ is decompactifying, which is like opening
up an extra dimension. This means that the topological sector
disappears and the momentum becomes continuous.


T-duality. If we have a string theory and we compactify
one dimension, something interesting happens with the
states of a simple string in that dimension. Of course
the string can oscillate, but we are not interested in those
states at this point. We want to look at the zero-modes.
The string can move around the circle and behave like a
particle, and that gives the spectrum we just discussed,
$E_n \sim p \sim n/R$, but for a string there are distinct
topological sectors, as the (closed) string can wind an
integer number m times around the circle with radius R. This
gives a contribution to the energy of the string
proportional to its length, a topological contribution
$E_m \sim mR$. So, for a string we arrive (choosing appropriate
units) at a simple formula for the energy spectrum of the
non-oscillatory modes: $E = E_n + E_m = n/R + mR$. This
spectrum is remarkable because it has an exact symmetry
under the inversion $R \to 1/R$, as is shown graphically in
Figure 1.4.55.


Figure 1.4.56: Compactification, the story of six inner
dimensions. Compactification means that space-time has four large
and six compact or internal dimensions. Here we show two
three-dimensional projections of possible six-dimensional
compactifying spaces. It is evident that these so-called Calabi-Yau
spaces have intricate geometries. (Source: Polytope24)


And as the circle is part of the space-time, the target space 
in which the string moves, this symmetry or map is called 
‘T-duality’ or target space duality. Here we showed the ele- 
mentary example where the duality was actually a symme- 
try, a self-duality, but the duality as a map plays an impor- 
tant role in the mapping of different 10-dimensional super- 
string theories onto each other as we will see shortly. 
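
The symmetry of the zero-mode spectrum can be verified in a few lines. This is a minimal sketch in the units used above, where $E = n/R + mR$; it restricts to non-negative quantum numbers up to a small cutoff (an assumption made to keep the list finite) and uses exact fractions to avoid rounding issues.

```python
from fractions import Fraction


def energy(n, m, radius):
    """Zero-mode energy E = n / R + m * R for momentum n and winding number m."""
    return Fraction(n) / radius + m * radius


def spectrum(radius, max_quantum=3):
    """Sorted list of energies for all 0 <= n, m <= max_quantum."""
    return sorted(energy(n, m, radius)
                  for n in range(max_quantum + 1)
                  for m in range(max_quantum + 1))


R = Fraction(3, 2)
print(energy(2, 1, R), energy(1, 2, 1 / R))   # same state, with n and m interchanged
print(spectrum(R) == spectrum(1 / R))         # the two spectra coincide
```

A state with momentum n and winding m at radius R has the same energy as the state with momentum m and winding n at radius 1/R, so no measurement of this spectrum can tell the two radii apart.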


Note that we have encountered two types of duality: the 
first was called ‘S-duality’ or ‘strong-weak duality’ which 
may apply to both supersymmetric particle and string theo- 
ries. Secondly we have ‘T-duality’ or ‘target space duality’, 
which only makes sense for string theories.


Calabi-Yau compactifications. An interesting and quite special
class of compact six-dimensional spaces over which the
superstring can be compactified is the class of
6-dimensional Calabi-Yau manifolds, used in the compactification
from ten to four space-time dimensions. These spaces
have a number of very special properties that we will not
discuss here. In Figure 1.4.56 we exhibit two-dimensional
views of three-dimensional (real) cross-sections of these
six-dimensional spaces. They are of interest for certain
superstring theories, because they ensure that the resulting
four-dimensional theory closely resembles a super
extension of the Standard Model.


The multiverse. A weakness of the compactification sce- 
narios in string theory is that nobody has been able to 
show that compactifications — if any — would be gener- 
ated dynamically by this prospective Theory of Everything. 
This is a pressing issue because what looked like a unique 
and universal theory turns out to have an astronomical 
proliferation of conceivable compactifications. But to each 
compactification would correspond a different type of four- 
dimensional universe, with its own cosmic history, parti- 
cle content and set of forces, in short, its own Not-So- 
Standard Model! Some of these universes could collapse 
before anything interesting would happen. Others may ex- 
pand too fast for stars to form, let alone life to develop. In 
some of them there would be light, while in others none, 
or maybe many sorts of light. In fact a mind-boggling uni- 
verse of universes is opening up, which is called the multi- 
verse. 


String theory so far is a theory of many possible theories, 
a theory of a multiverse in which wildly differing types of 
physics could manifest themselves, even in parallel. And, 
yes, ours would be just one of them. This is quite or- 
thogonal to the basic motivation of most scientists who 
search for a unique universal theory of Nature. The com- 
mon prejudice used to be that the Theory of Everything 
would come up with our dear universe as the unique or 
at least strongly favoured solution. We like to think of our 


Figure 1.4.57: A multiverse? An artist's impression of the
multiverse. A two-dimensional projection of a ten-dimensional com-
pactification scheme. (Source: Forum Futura) 


world as the unique expression of the universal principles 
underlying that theory. It would have left the Creator but 
one option. It came somewhat as a shock that, after a 
promising start in that all-overarching direction, string the- 
ory moved in fact the opposite way. Maybe it is trying to 
tell us something contrary to our expectations, and for that 
science has excellent credentials. After all, the existence 
of a multiverse is yet another step away from our old an- 
thropocentric dream of us being (in) the center of THE uni- 
verse. THE universe? What are you talking about? Such 
a dramatic form of relativism would indeed constitute the 
ultimate irony of science, or of human existence. 


It is not by accident that a strong protagonist of the
multiverse, Leonard Susskind of Stanford University, who wrote
a popular book about it called The Cosmic Landscape, claims that
the proper interpretation and main prediction of string the- 
ory is precisely that we live in a multiverse. The nasty as- 
pect of this view is that the existence and properties of our 
own universe become extremely hard to predict from such 
a premise and in that sense not much progress has been 
made. It is an example of where contingency and
evolutionary thinking enter physics at the most fundamental
level. It is analogous to, but much worse than, asking a
physicist to precisely predict the number and sizes of the
planetary orbits in our solar system from Newton's laws. That
can't be done. Newton's equations describe all sorts of
planetary systems. And indeed, all observed systems do
fit in his framework, but asking to predict them would be
many bridges too far. And to be fair, we don't expect that,
because we know that the details of the solar system are
the outcome of a highly contingent, non-universal
historical process and not simply calculable from first principles.
To put it differently, all dogs are animals but not all animals
are dogs, that's the problem! It is the biological conundrum:
we know what an animal is and what the species on Earth
look like, but predicting them from single-cell organisms
using genetics is somewhat harder.


M-theory, D-branes and dualities 


It was believed for many years that there were five 
possible string theories, prompting the question: 
if one of these describes our universe, who lives 
in the other four worlds? But recently it has be- 
come clear that those five string theories are limit- 
ing cases of one majestic and mysterious theory. 
Edward Witten, Nature (1996) 


D-branes. Whether they liked it or not, string theorists 
discovered that strings are not enough to make a con- 
sistent quantum theory of gravity. In fact 11-dimensional 
supergravity had a somewhat uncomfortable ‘living apart 
together’ relation with the 10-dimensional string theories. 
This supposedly low-energy approximation of string theory
lived one dimension up and had features that were
lacking in string theory. These were stable soliton-like
classical solutions called branes, to be thought of as
p-dimensional generalizations of membranes. This quite naturally
prompted the question what the role of these p-branes in 
string theory would be. 


Figure 1.4.58: D-branes and strings. A stack of three flat
D2-branes. Open strings have to end on branes; by connecting
them they represent nine U(3) gauge fields living on the branes.
This figure is reminiscent of Figure 1.4.37 with the bi-colored
lines representing gluons propagating. Closed strings are not
connected to branes and correspond to gravitons.


Strings are one-dimensional extended objects, and indeed,
a question that came up early on was: why, if you
give up the unique particle notion as the fundamental
starting point, stop at one-dimensional extended objects? Why
not include membranes and other higher dimensional
extended objects? It was part of the second string
revolution around 1995 that Joe Polchinski of the Kavli Institute
for Theoretical Physics in Santa Barbara had the crucial
insight that higher dimensional objects, which he called
D-branes, indeed had to be included. He introduced D-branes in
string theory as the end points of open strings. They could
therefore in principle have dimensions p running from zero
to nine. D-branes could be flat and of infinite extent, or
curled up into compact objects like black holes for example;
they could be single or stacked up. Each type of string
theory would allow for D-branes of specific dimensions. In
superstring theory these p-dimensional D-branes, or
Dp-branes, are dynamical objects which in the appropriate
supergravity limit correspond with the very heavy soliton-like
p-branes.


In Figure I.4.58 we have depicted a stack of three colored 
D2-branes, and we have also drawn open strings ending 
on them. Let us discuss this picture in the weak-coupling, 
low-energy limit of string theory. It is known that open 
strings then carry a vector representation of the supersymmetry 
algebra. These open string states correspond to fields carrying 
one space-time index μ, in this case running from 0 
to p = 2, and two ‘internal’ (color) indices labeling 
the branes to which the string is connected. In other words, 
they beg to be identified as gauge fields A_μ^{ab}. In the figure 
there are nine possible color combinations, making up a 
three-dimensional U(3) ‘color’ gauge theory. The picture 
clearly shows that the gauge theory is attached to the Dp-brane 
and is therefore (p+1)-dimensional. From the branes’ 
point of view the strings between them are like excitations 
of the branes: they describe the brane dynamics. If the D-brane 
is embedded in a higher-dimensional space, then 
the closed strings live in a higher-dimensional 
(d > p) space than the gauge fields, as indicated in the 
figure. This fact posed yet another conundrum one had to 
face. One additional comment on the figure: imagine the 
D-branes to be so-called black branes, meaning that 
they would correspond to some horizon. Then one could 
imagine the open strings pairing up to make 
closed strings, which could then leave the D-brane. The brane 
would radiate! 
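
The counting of color combinations can be made explicit with a toy enumeration. The sketch below is only an illustration and not part of the string-theory formalism: the function name and the brane labels are invented for the occasion, and the only input is that an oriented open string is labeled by the brane it starts on and the brane it ends on.

```python
from itertools import product


def open_string_sectors(brane_labels):
    """Return all ordered (start, end) pairs of branes that an
    oriented open string can connect (hypothetical helper)."""
    return list(product(brane_labels, repeat=2))


# Three stacked branes, labeled by color as in the figure.
sectors = open_string_sectors(["red", "green", "blue"])

for start, end in sectors:
    print(f"string from {start} brane to {end} brane")

# Number of sectors for a stack of three branes: 3 * 3 = 9.
print(len(sectors))
```

For a stack of N branes the same counting gives N² sectors, which matches the number of generators of the group U(N).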


M-theory and superstring dualities. 


This theory, which is sometimes called M-theory 
(according to taste, M stands for magic, mystery, 
marvel, membrane or matrix), is seen by many as a 
likely candidate for a complete description of nature. 

Edward Witten, Nature (1996) 


In Figure I.4.60 we give a bird’s-eye view of the modelling 
landscape in ten and eleven dimensions. Who is living 


Figure I.4.59: Duality. In this painting by the Renaissance Milanese 
painter Giuseppe Arcimboldo (1526-1593), we see that 
a given physical vegetarian reality, this particular painting called 
Verdure or Vegetables, has two different interpretations which 
are dual to each other. In the weak coupling limit it is a vegetable 
basket, in the (I presume) strong coupling limit it turns 
into a vegetable face. The transformation is a rotation over an 
angle of π. (Source: ©Photo Scala, Florence.) 


where and how they are related. The precursor of string 
theory was 11-dimensional supergravity: it did have attractive 
features, for example classical 2- and 5-brane solutions, 
but seemed not to be fully consistent. In the figure 
it has moved a bit to the background because it is 
presently understood to be the low-energy approximation 
of M-theory. That magic theory remains in many ways 
a mystery, as there is no explicit formulation available, and 
we don’t even know if such a formulation exists. The 
most fundamental principle underlying M-theory 
may therefore still be hidden as well. 


M-theory is known because of its low energy supergrav- 
ity manifestation and through other limits which manifest 
themselves as the five superstring theories in ten dimen- 
sions. These limits are related to compactification of one 
Figure I.4.60: M-theory and string dualities. M-theory is a theory 
in eleven dimensions. It contains all five 10-dimensional 
superstring theory types. These ten-dimensional theories can be 
related through certain S- and T-dualities. 


space dimension. For example, the low-energy limit, 
supergravity, can be compactified over a circle to ten dimensions, 
where the supergravity 2-brane is wrapped around 
that circle, which reduces the 2-brane to a string. The 
resulting theory could be identified as the weakly coupled 
ten-dimensional Type IIA string theory. In the strong-coupling 
limit of the string theory the compactified dimension 
would open up, as we have discussed before. Furthermore, 
in the Type IIA string theory one could do a T-duality 
transformation (like R → 1/R), which turns the theory into the 
Type IIB string theory. It is in this sense that many connections 
between the various models were established, 
and indeed this network of dualities clearly demonstrated 
that these five theories are basically five different guises of 
one underlying theory, which has been named M-theory. 
Quite a mind-boggler! 
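
The flavor of a T-duality transformation can be tasted in a small numerical experiment. The sketch below assumes the standard textbook expression for a closed string on a circle of radius R, in units where α′ = 1: a state with momentum number n and winding number w contributes (n/R)² + (wR)² to the mass squared. The oscillator contributions, which do not depend on R, are left out, and the function names are illustrative only. The check is that the list of these contributions is unchanged under R → 1/R, because momentum and winding simply trade places.

```python
from fractions import Fraction


def kk_winding_mass_squared(n, w, R):
    """Momentum plus winding contribution to the mass squared,
    (n/R)^2 + (w*R)^2, in units where alpha' = 1 (assumed convention)."""
    R = Fraction(R)
    return (Fraction(n) / R) ** 2 + (Fraction(w) * R) ** 2


def spectrum(R, max_number=3):
    """Sorted list of contributions for all |n|, |w| <= max_number."""
    numbers = range(-max_number, max_number + 1)
    return sorted(kk_winding_mass_squared(n, w, R)
                  for n in numbers for w in numbers)


R = Fraction(3, 2)      # an arbitrary radius
dual_R = 1 / R          # the T-dual radius

# State by state: (n, w) at radius R matches (w, n) at radius 1/R.
print(kk_winding_mass_squared(2, 1, R) == kk_winding_mass_squared(1, 2, dual_R))

# The full lists of values coincide as well.
print(spectrum(R) == spectrum(dual_R))
```

Exact fractions are used instead of floating-point numbers so that the two spectra can be compared with a plain equality test.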


At this point of the tour we have arrived at the center of 
the third string era, and I could imagine that if you are 
a ‘freshman’ reader, not at all familiar with these ideas, 
this narrative will come across as an arcane, brilliant but 
bizarre endeavor: a type of excursion into domains of the 
mind that you would not expect in a book about physics, a 
discipline that stands out for its factual rigour and its exemplary 
strong empirical basis. 


The reason that I include these developments is exactly 
because this is what the struggle of science at the frontiers 
of knowledge may look like and should look like. It should 
be explorative in all conceivable ways, as long as it is not 
plainly stupid. This holds for the experimental as well as 
theoretical domain. It was Einstein who in 1934 made the 
following remark: 


The theoretical scientist is compelled in an increas- 
ing degree to be guided by purely mathematical, 
formal considerations [...]. The theorist who undertakes 
such a labor should not be carped at as “fanciful”; 
on the contrary, he should be granted the 
right to give free rein to his fancy, for there is no 
other way to the goal. 


Let us take this advice to heart and embark on exploring a 
final set of far reaching ideas that got a spectacular impe- 
tus out of string theory. 


Holography and the AdS/CFT program 


We would like to advocate here a somewhat ex- 
treme point of view. We suspect that there simply 
are no more degrees of freedom (inside a volume) 
to talk about than the ones one can draw on its surface, 
as given by S = A/4. The situation can be 
compared with a hologram of a three-dimensional 
image on a two-dimensional surface. The details 
of the hologram on the surface are intricate and 
contain as much information as is allowed by the 
finiteness of the wavelength of light - read the Planck 
length. 

Gerard ’t Hooft, Salamfestschrift (1993) 
Holography. We are going to discuss a profound novel 
correspondence that is of interest to physics and mathe- 
matics of many sorts. It is the holographic idea, that all the 
information contained in a space of (d+1) dimensions can 
be faithfully represented on a holographic screen of d di- 
mensions. In the context of gravity, this conjecture was first 
put forward in the quantum understanding of black holes 
by ’t Hooft as early as 1993 and taken up shortly after by 
Susskind in the context of string theory. We have talked 
about black hole entropy and the information paradox in 
the section on black holes in the previous chapter on page 
139. The current narrative is that information cannot get 
lost but instead is somehow encoded, ‘frozen in’, on the 
horizon. The horizon keeps track of all things that pass 
by, so to say. This information content corresponds exactly 
with the Bekenstein-Hawking entropy that is also located 
on the horizon of the black hole. It would allow for the 
possibility that the information would ultimately be carried 
away again as hidden correlations in the Hawking radia- 
tion. So the information carried by things that have fallen 
into a black hole can — in principle at least — be retrieved. 
In particular in a consistent quantum mechanical descrip- 
tion of the black-hole formation and evaporation process 
this has to be the case. 
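
To get a feeling for the numbers behind S = A/4, one can restore the units: the entropy in units of Boltzmann’s constant is the horizon area divided by four times the Planck length squared. The sketch below evaluates this for a non-rotating black hole of one solar mass. The rounded values of the constants and the function names are assumptions made for illustration.

```python
import math

# Rounded SI values (assumed for this estimate).
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
HBAR = 1.0546e-34    # reduced Planck constant, J s
M_SUN = 1.989e30     # solar mass, kg


def horizon_area(mass):
    """Horizon area of a non-rotating black hole: A = 4 pi r_s^2,
    with Schwarzschild radius r_s = 2 G M / c^2."""
    r_s = 2 * G * mass / C**2
    return 4 * math.pi * r_s**2


def planck_length_squared():
    """Planck length squared, G hbar / c^3, in m^2."""
    return G * HBAR / C**3


def entropy_in_units_of_k(mass):
    """Bekenstein-Hawking entropy S/k = A / (4 l_p^2)."""
    return horizon_area(mass) / (4 * planck_length_squared())


# For one solar mass the result is of order 10^77.
print(f"S/k for a solar-mass black hole: {entropy_in_units_of_k(M_SUN):.2e}")
```

Because the area grows with the square of the mass, doubling the mass of the black hole quadruples its entropy.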


Black hole holography. The study of holography applied 
to black holes in the context of string theory culminated 
in a 1997 paper by Juan Maldacena at the Institute for 
Advanced Study in Princeton, in which he made a strong 
claim of an exact dual relation between a superstring theory 
in a 5-dimensional Anti de Sitter space, denoted 
AdS₅, and an N = 4 supersymmetric SU(N) gauge theory 
for large N defined on the boundary of AdS₅, which is 
equivalent to four-dimensional flat Minkowski space. This 
exceptional claim has in the meantime been substantiated 
and extended in many very convincing ways. 


It brings together a number of ideas that we have touched 
upon in this book: the idea of symmetric curved space- 
times that are solutions to Einstein’s equations, the idea 
Figure I.4.61: AdS/CFT correspondence. The superstring theory 
is compactified over S⁵ to a five-dimensional Anti de Sitter 
(AdS) space-time, which corresponds to the interior of the cylinder. 
That space has a 4-d boundary which is a flat Minkowski 
space (the cylindrical surface) on which a conformal field theory 
(CFT) lives, which is the hologram, a fancily encoded but faithful 
representation of the five-dimensional string theory in the interior. 


of supersymmetry and supergravity, the N = 4 super- 
conformal gauge theories, and the ideas of string theo- 
ry/supergravity compactification. Indeed a more encom- 
passing confluence of ideas is hard to imagine and it may 
rekindle dark memories of some horrifying final exam that 
you once failed. I am sorry! 


To be slightly more specific, we are discussing the 10-dimensional 
IIB string and supergravity models, which are 
defined on a ten-dimensional space-time M = AdS₅ × S⁵. 
On the AdS space one then ends up not with 8 but with 4 
supersymmetries; the other four get broken by the compactification. 
Furthermore the background space AdS₅ is a 
very special space that has a large symmetry group which 
corresponds with the four-dimensional conformal group, 
and this yields a space-time theory with an N = 4 super-conformal 
symmetry. 
Figure I.4.62: AdS/CFT correspondence. An artist’s impression 
of the holographic principle as it is operative in a compactification 
from ten to five dimensions. The superstring theory lives 
in a five-dimensional Anti de Sitter (AdS) space-time (inside the 
bowl). That space has a 4-d boundary which is a flat Minkowski 
space (the blue bowl) on which a conformal field theory (CFT) 
lives, which is the hologram, a fancily encoded but faithful representation 
of the five-dimensional string theory in the interior. 
(after A.T. Kamajan) 


This string theory has actual open and closed strings but 
also D3-branes, and we put a stack of N of them at the boundary, 
thus producing an N = 4 super-conformal SU(N) gauge 
theory at the boundary. Note that the symmetries of both 
theories coincide, which facilitates the identification 
of string states in the bulk with particular observables 
in the quantum field theory. 


This gauge/gravity duality is a strong-weak type duality, 
which means that the low energy weakly coupled gravi- 
tational theory tells us about the strong coupling behavior 
of the quantum field theory. And as there is a complete 
equivalence the converse is also true, so the weakly cou- 
pled gauge theory should teach us about strongly interact- 
ing string theory. 


If you look at the Hilbert space of states or the spectrum 
of the string theory on AdS₅, you get an impression of how 
involved and surprising this Maldacena correspondence 
must be. For very low energy the theory of gravity just is 
the Einstein equations linearized around the background, 
and the excitations are gravitational waves. If we move to 
string theory, we get the closed strings which represent the 
massless supergravity degrees of freedom and if the back- 
ground has D-branes we will excite open strings attached 
to branes. After that massive string modes will also be ex- 
cited. When we go up even further in energy we will enter 
the regime where D-branes will be created. And if they are 
sufficiently heavy, they may start to form small black holes. 
And the more energy we put in the heavier and larger the 
black holes become. This process can continue until the 
horizon coincides with the boundary of a very large black 
hole. To imagine that this great variety of interacting de- 
grees of freedom can be faithfully mapped to a large N 
super-conformal gauge theory is quite miraculous. What 
I can tell you is that a great variety of checks has been 
performed (involving very extensive computations of par- 
ticular features) and all of these have confirmed the ex- 
pectations. 


However, the real shortcoming of this correspondence is 
that it only seems to work in this quite exceptional geom- 
etry with the serious problem that it involves the Anti de 
Sitter space which has a negative cosmological constant. 
This is in direct contradiction to the well-established fact 
that our universe has a small but definitely positive cosmo- 
logical constant. This appears to pose a serious challenge 
to the AdS/CFT programme. The question is whether there 
is some version of a duality that holds also in De Sitter 
space. 


Emergent gravity. This brings us to a brief description of 
the still rather controversial idea of ‘emergent’ or ‘entropic 
gravity’ and its protagonists like Erik Verlinde and collab- 
orators. They have addressed the question of what could 
happen if we move from an Anti de Sitter to a De Sitter 
background with positive cosmological constant. It is suggested 
that a rigorous holographic image on the boundary 
is no longer what happens; instead the speculation is that 
the entropy will acquire a volume term and spread from the 
boundary into the space, maybe causing deviations from 
Newton’s laws that look like the effects of dark matter and 
dark energy. If confirmed by further analysis, that would 
certainly constitute another truly stunning result. 


At home in the quantum world 


Before I came here I was confused about this subject. 
Having listened to your lecture I am still confused. 
But on a higher level. 

Enrico Fermi 


We have come to the end of the first part of the book. This 
part was devoted to the basic concepts and contents of 
quantum theory and its classical roots, as it developed over 
the last century. We have of course always lived in a quantessential 
world, but it is only now dawning upon us what 
that means. We started by describing the gems of classical 
physics, which ran into a number of serious troubles that 
could only be resolved by embracing the quantum princi- 
ples. In this chapter we described the subsequent suc- 
cesses of applying the quantum principles to ever deeper 
layers of the microscopic world. A journey that as we saw 
is by no means completed. 


Quantum theory entered our thinking on the atomic scale, 
say at 10⁻¹⁰ meters, and from there it started spreading. 
We recall that there are two ways to go from there and 
extend the results. The first is to go to ever smaller scales, 
and that is the route we have followed in this first part of 
the book. We went all the way down from the atom, via the 
nuclear structure to the elementary quarks and leptons, 
to a scale of about 10⁻²⁰ meters, the scale accessible to 
modern accelerators like the LHC at CERN. 


Figure I.4.63: It from Bit. In a famous essay John Archibald 
Wheeler pondered over the philosophical ramifications of the 
idea that Information lies at the basis of our universe. Is it possi- 
ble that All of Nature grew out of information only? The Cosmic 
Code as an all-embracing Hyper Genetics. This intriguing image 
also symbolizes evolution, or how Nature seems to be in search 
of itself, through the human effort of scientific inquiry, which by 
definition is part of that Nature. 


The other way is to go up and apply quantum theory on 
scales corresponding to chemistry, or the many other forms 
of condensed matter that we find in nature or create in the 
lab. We save this part of the quantum story for the final Vol- 
ume. The next, middle part of the book is called quantes- 
sence, and is devoted to the more formal aspects of quan- 
tum theory. We will expose some of its very rich logical 
and mathematical structure and comment on it. I think it 
would be a poor choice to leave it out, exactly because it is 
a central part of quantum theory. We can’t really do without it, 
because what makes the theory so attractive is that 
on a conceptual level it is so counter-intuitive. And where 
confusion reigns it is vital to keep the language as clean as 
possible, so as to be sure about what the questions are and 
what the answers mean. I don’t know of any theory where 
the mathematical framework is so rich and unambiguous, 
and at the same time the narratives and interpretations are 
so paradoxical and hard to grasp. This makes the theory 
exciting and we will discuss a number of well-known but 
stupefying paradoxes in the next Volume. We are all set to 
start climbing the amazing mount quantum. 


If the world ‘out there’ is writhing like a barrel of 
eels, why do we detect a barrel of concrete when 
we look? To put the question differently, where is 
the boundary between the random uncertainty of 
the quantum world, where particles spring into and 
out of existence, and the orderly certainty of the 
classical world, where we live, see, and measure? 


This question...is as deep as any in modern physics. 


The other side is usually a dark place? 


Not necessarily. I think it has more to do with 
curiosity. If there is a door and you can open 
it and enter that other place, you do it. It’s just 
curiosity. What’s inside? What’s over there? 
So that’s what I do every day. [...] once I start 
writing, I go somewhere else. I open the door, 
enter that place, and see what’s happening 
there. I don’t know - or I don’t care - if it’s a realistic 
world or an unrealistic one. I go deeper 
and deeper, as I concentrate on writing, into 
a kind of underground. While I’m there, I encounter 
strange things. But while I’m seeing 
them, to my eyes, they look natural. And if 
there is a darkness in there, that darkness comes 
to me, and maybe it has some message, 
you know? I’m trying to grasp the message. 

So I look around that world and I describe what 
I see, and then I come back. Coming back is 
important. If you cannot come back, it’s scary. 
But I’m a professional, so I can come back. 


The Japanese author Haruki Murakami in 
an interview by Deborah Treisman in The New 
Yorker (2019) 
Chapter II.1 


The quantum formalism: states 


There’s no sense in being precise when you don’t 
even know what you're talking about. 
John von Neumann 


Quantum theory has kept the community of physicists un- 
der its spell for over a century. It has opened new hori- 
zons for understanding a myriad of fundamental phenom- 
ena that were observed at ever deeper levels of nature, 
and it has produced a huge quantity of crucial results for 
the applied sciences. It has manifested itself in virtually all 
subfields of physics and from there entered into other ad- 
jacent fields like chemistry, engineering, informatics and 
even biology. And this process is still going on. 

In this Volume we focus on the ‘quantessential’ features of 
the theory. This means that we will go into more detail with 
respect to the mathematical formalism underlying the the- 
ory. For pedagogical reasons we will apply it only to simple 
systems, and this may well give the impression that I am 
using a sledgehammer to crack peanuts. 


The basic structure of the theory we are about to explore 
has far-reaching logical consequences. It will keep us busy 
in the following chapters on qubits, measurements, inter- 
ference, entanglement and dynamics. We develop these 
concepts starting from the perspectives of classical phys- 
ics, quantum physics and information physics. The starting 
point is always to define the system by the identification 
of its ‘degrees of freedom’ or basic dynamical variables. 


These can be ‘external’, like position, momentum, angular 
momentum or energy, or ‘internal’ where one may think of 
electric charge or something more exotic like intrinsic spin, 
isospin or color charge. 


In Chapter II.1 we focus on the basic notions related to 
quantum states, such as state vectors, Hilbert space, sep- 
arable versus entangled states, pure versus mixed states 
and the concepts of a density matrix and quantum entropy. 
In Chapter II.2 we discuss the notions of observables as 
operators, and the probabilistic nature of a quantum mea- 
surement. We also introduce the concept of incompatible 
observables, frames of reference and the Heisenberg un- 
certainty relations. 

Chapter II.3 is about quantum interference in various dou- 
ble slit type of experiments, but also its manifestation in the 
so-called Berry phase. 

In Chapter II.4 we turn to quantum teleportation and quan- 
tum computation. Teleportation is the consequence of the 
quantessential possibility of entangled states, which will be 
illustrated in a number of famous experiments and para- 
doxes. The results of recent experiments lead to the in- 
escapable conclusion that quantum theory is correct. This 
means that theories built on hidden variables and local re- 
alism are no longer tenable in view of these experiments. 
Concerning quantum computation we introduce the no- 
tions of quantum gates and circuits, and discuss the fac- 
torization algorithm of Shor in some detail. 


In Chapter II.5 we turn to the quantum theory of particles, 
fields and strings and illustrate a number of quantessential 
properties, such as the quantum statistics of particles and 
the spin-statistics connection. Volume II closes with Chapter 
II.6, where we give an overview of the role that symmetry 
and symmetry breaking play in physics and quantum 
physics in particular. 


Quantum states: vectors in Hilbert space 


If we describe a physical system in the classical realm, the 
relevant variables like position, velocity or momentum and 
energy are part of the definition of the system. They are 
observables in that we can measure them, thereby pro- 
ducing dimensionful values as an outcome. 

We have mentioned what in quantum physics the states 
look like: they are vectors in some rather abstract state 
space called the Hilbert space, and in this section we will 
show how and to what extent the ordinary physical vari- 
ables can be retrieved from the state vector. 

The crucial fact is that in the quantum formalism observ- 
ables are not represented by just numbers but are defined 
as matrices or operators acting on the state space. That 
sounds complicated, and yes, it is. It illustrates a remark 
made by Paul Dirac who stipulated that matters, which at 
a certain moment may be considered merely as pastimes 
for mathematicians and logical thinkers, may turn later into 
tools that are indispensable for understanding nature. And 
if understanding nature is our goal it may be worthwhile 
to familiarize ourselves with these mathematical concepts, 
just like the pioneers of quantum theory had to do a cen- 
tury ago. 


In this chapter we point out the quantessential differences 
between classical and quantum systems for the simplest 
of all quantum systems, the quantum spin or qubit. This 
two-level system plays a fundamental role in many appli- 
cations of quantum theory, but is also a favorite toy-model. 


The ability to control and manipulate arrays of qubits is the 
holy grail of quantum technology, as it entails the production 
of quantum information processing devices that 
enable novel applications, varying from quantum key distribution 
and teleportation to quantum computation. It is a 
major challenge to find physical implementations of a ba- 
sic qubit that can be reliably manipulated and at the same 
time can be scaled to large arrays. 


Reader alert. Remarkably, in talking 
about quantum concepts and mean- 
ing, formulas are often easier to under- 
stand than words. However, if you are 
not familiar with the notion of operators and matri- 
ces, don’t despair! The philosophy of the book is 
not to shy away from them, but to plug and play 
with them in the simplest imaginable cases to gain 
familiarity with them. As with driving lessons, you 
don’t have to drive all the way from Spokane to Miami 
Beach and back to get a proper appreciation for 
what a highway is. I kindly request that you accept 
the definitions for what they are; then we will play 
around a bit so that you will end up throwing matrices 
around like ordinary numbers. 

I will supplement the rather abstract algebraic language 
of matrices and the like, whenever possible, 
by more geometric images; for most people 
imagery provides more insight and is easier to remember. 
And talking about vectors and matrices, I 
should like to remind you of the respective Math Excursions 
at the end of Volume III, because those intros 
will make understanding the forthcoming chapters 
a lot easier. The use of a symbolic language 
will at least keep us from slowly getting lost in a 
dense fog of ever more cryptic quantum terminology 
and quantum vagueness. Take my word, or 
rather, my equations for it. 
This challenge is approached from many different angles, 
like quantum optical systems, superconducting devices, 
atoms in optical lattices, ions in traps, and topologically ordered 
phases. Progress is rapid, which means that quantum 
devices exploiting the fundamental features of quantum 
theory may well be with us in a decade or two. 


Quantum versus classical 


I think it is safe to say that no one understands 
quantum mechanics. Do not keep saying to yourself, 
if you can possibly avoid it, ‘But how can it be 
like that?’ because you will go ‘down the drain’ into 
a blind alley from which nobody has yet escaped. 
Nobody knows how it can be like that. 

Richard Feynman 


We start by comparing the quantum and classical world 
generally. The fundamentally different concepts and for- 
mulations have profound consequences for the logical and 
deductive structure of the theories. Where do these worlds 
meet or separate? Actually, do they? 


Classical systems. In classical physics it is usually quite 
obvious what the system consists of and what the possi- 
ble states are. If we talk about a particle for example we 
will typically specify the state by assigning it a mass m, a 
position x, and a velocity v. Given the state at some ini- 
tial time, Newton will tell us what the state will be at any 
later time, provided we know the forces that act on the par- 
ticle along the way. For a field like the electromagnetic 
field we specify the field configuration, by which we mean 
that we give the electric E and magnetic B fields over all 
of space. Then the Maxwell equations tell you all about 
the time development of that initial field configuration, pro- 
vided we know what the external charges and currents, 
usually called sources, are. The evolution of the gravita- 
tional field is described in a similar way by the Einstein 


equations. Subsequently we have to combine the frame- 
works of Newton, Maxwell and Einstein to get the actual 
time development of the complete classical system of par- 
ticles with and without charge and gravitational and elec- 
tromagnetic fields. The structure of the theory is absolutely 
unambiguous, based on a clear methodology. 


Yet, the coupling of the different components of fields and 
sources makes the system extremely nonlinear and there- 
fore hard to solve explicitly. For example there is the in- 
tricate problem of the ‘back reaction’: the fields will not 
only change as a consequence of the movement of the 
charges, but in addition the accelerated charges will radi- 
ate. There are certain simple cases that can be dealt with 
analytically through closed expressions in terms of stan- 
dard functions, but mostly that is not the case. Whereas 
we can solve the Newtonian two-body problem analytically, 
this is not the case for the three-body problem. One has 
to resort to numerical procedures which can become ex- 
tremely cumbersome, if one insists on high accuracy, which 
is the case if one wants to make predictions about the be- 
havior of the system on long time-scales. This point leads 
us to an additional observation that should be made con- 
cerning classical physics. 


Nonlinear dynamics and deterministic chaos. We just 
stated that if we know for example the position and veloc- 
ity of a particle at a given instant in time, the time evo- 
lution is completely fixed by Newton’s laws provided we 
know the forces acting on the particle. This implies that 
any uncertainty in its evolution is driven by the limited ac- 
curacy of the initial conditions. This is not as innocuous as 
it sounds even if one has a huge zoo of advanced com- 
puters at one’s disposal. What we have learned in the 
last half century from studying simple nonlinear systems 
is that already on a classical level, such systems — in spite 
of being completely deterministic — can exhibit chaotic be- 
havior. In such situations it is not possible to make precise 
long-term predictions, because small initial uncertainties 
can be amplified exponentially in time by the chaotic dynamics 
of the nonlinear system. These systems exhibit an 
extreme sensitivity to initial conditions, often referred to as 
the butterfly effect, meaning that a tiny change in the initial 
condition may lead to vastly different consequences a rel- 
atively short time afterwards. However, what concerns us 
here is that within classical physics there is no fundamen- 
tal limit on the accuracy of measurements — by measuring 
more and more carefully, we can predict the time evolu- 
tion of a system more and more accurately. The system is 
fundamentally deterministic. This is no longer true in the 
quantum world because there we will run into a fundamen- 
tal limit on the accuracy of the simultaneous measurement 
of physical observables. 
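
The exponential amplification of small uncertainties is easy to see in a numerical experiment. As a minimal sketch, take the logistic map x → 4x(1 − x), a standard textbook example of deterministic chaos that is not discussed in the text above, and follow two trajectories whose starting points differ by one part in 10¹⁰. The function names and the starting value are arbitrary choices made for illustration.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def separation_history(x0, delta, steps):
    """Distance between two trajectories that start at x0 and
    x0 + delta, recorded at every step from 0 up to `steps`."""
    a, b = x0, x0 + delta
    history = []
    for _ in range(steps + 1):
        history.append(abs(a - b))
        a, b = logistic_map(a), logistic_map(b)
    return history


history = separation_history(x0=0.2, delta=1e-10, steps=60)

# The rule is fully deterministic, yet the tiny initial difference
# grows until the two trajectories have nothing to do with each other.
for step in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: separation = {history[step]:.3e}")
```

Each step can stretch the separation by at most a factor of four, so the first few steps look harmless; after a few dozen steps the separation has grown to the size of the interval itself and any prediction based on the initial data has lost its value.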


The correspondence principle 


Where classical and quantum meet. At the most basic 
level there are fundamental differences between the classi- 
cal and the quantum frameworks. On macroscopic scales, 
meaning relatively large scales of space, time and energy, 
where we know classical physics works well, the predic- 
tions of classical and quantum theories of course have to 
agree. This requirement is known as the correspondence 
principle. There is no logical path that brings you from 
classical physics to quantum physics, but the converse is 
certainly possible and even mandatory. We should insist 
on understanding the emergence of all of classical physics 
from the underlying quantum description. This turns out 
not to be straightforward at all, but then, nobody promised 
us it would be. In Figure II.1.2 we have symbolically indicated
the classical and quantum worlds. We contrast
the direction of the historical process of scientific evolu- 
tion, moving us out of the classical into the quantum do- 
main, versus the direction of logical deductions and im- 
plications which go the opposite way. It warns us that 
we should not strive for an interpretation or representation
of quantum content in classical terms; that would be
a terribly misguided effort indeed. So, historically, quantum
theory emerged out of the classical theories, but logically
it is the other way around, and that is inherent to the
way knowledge transcends itself in the process of scientific
progress.


Figure II.1.2: Classical versus quantum. We were born in
a classical world, but after exploring the nature of things we
have discovered the existence of a much larger quantum world.
Once these discoveries were made, we understood that the
logic should be reversed: it is the classical world that can be
logically deduced from the quantum world, and not the other
way around.


Classical phenomena with quantum explanations. As 
we discussed in the previous Volume, for example in Chapter
I.2, the scale of the quantum regime is set by Planck’s
constant h, or h-bar, defined as $\hbar = h/2\pi$, which has dimensions
of energy × time (or equivalently momentum ×
length). Because of the tiny value of this constant, we expect
the quantum properties to become manifest at small
time and length scales, and low temperatures. However, 
collective macroscopic behavior is to a large extent an indi- 
rect manifestation of the properties of the basic constituents 
of the system, and of the interactions between them and 
the environment. After all, notwithstanding the striking
similarities between an ant colony and human society, the
even more striking differences between them can be largely
traced back to the differences between an individual hu- 
man being and an individual ant. Looking at matter in 
a similar way, one expects that radically different prop- 
erties at a microscopic scale (say at the level of atomic 
and molecular structure) may in turn lead to fundamentally 
different collective behavior of these basic building blocks 
and therefore to different emergent properties on a macro- 
scopic scale. So, one certainly should expect quantum 
manifestations on a macroscopic scale after all. Indeed, 
most phases of condensed matter realized in nature, such 
as crystals, ordinary conductors, semiconductors, super- 
conductors or magnetic materials, all involve forms of col- 
lective behavior that can only be understood from a quan- 
tum perspective. The (meta-)stability and structure of mat- 
ter is intimately linked to the quantum behavior of its fun- 
damental constituents. 


The quantum domain. Returning to the question of states, 
as we will see in this and the following chapters, the quan- 
tum states of bits, particles or fields are very different from 
their classical precursors and in the beginning it was even 
far from evident what the space of states would be. How- 
ever, once we found out, we learned that the structure of 
the state-space tells us a lot about the generic features of 
quantum systems and how these may radically differ from 
their classical analogues. Studying the underlying mathe- 
matical structure will enable us to anticipate what we might 
expect in real physical situations. With some exaggeration 
one could say that everything that is not forbidden is compulsory,
and hence will manifest itself somewhere in
Nature. Nature is quantum. 


Many exotic quantum features like particle interference or 
entanglement derive directly from its underlying structure, 
but that didn’t make it any easier to demonstrate these fea- 
tures through experiment. Many predictions of quantum 
theory have lingered on the margins, waiting for experimental
techniques to develop to the required level of precision.
There are quite a few examples where it has taken
more than half a century before predictions could be put to
the test. Science requires not only brilliance but also patience.
Nowadays, many quantessential phenomena can
be beautifully demonstrated by experiments exploiting su- 
perconductivity and quantum optics. There is still much 
more to discover, which is why we want to explore these 
quantum state spaces and their remarkable properties in 
this separate second volume. Whereas the present state 
of modelling real systems in nature within the quantum mechanical
framework is described in Volumes I and III,
this volume is dedicated to the ‘cosmic code’ itself. 


Classical states: phase space 


The state at some time t of a classical system is speci- 
fied by assigning values to a minimal subset of dynami- 
cal variables from which all possible other variables can 
be calculated. We say that the state of the system corresponds
to a point in phase space $\mathcal{F}_{ph}$. We are going to
discuss the case of a basic particle and work out the dis- 
crete ‘Newtonian’ dynamics of an Ising spin or classical bit 
as an example. 


Phase space. To specify the state of a simple particle, 
which may have a mass m and a charge q, we have to 
give its position x and its velocity v or momentum p = 
mv. The space of positions is usually called configuration
space and denoted as $\mathcal{X}$. In three-dimensional space both
position and velocity have three components because they
are vectors, and thus the phase space $\mathcal{F}_{ph} \sim \{x, p\}$ has
six dimensions. From the point of view of particle dynam- 
ics, mass and charge are just fixed external parameters. 
Note that other dynamical variables of a particle, like its 
energy or angular momentum, can be expressed in terms 
of velocity and position and therefore can be calculated 
once the point in phase space is given. 


A property corresponds to a subspace of the phase space.
A state of the system can be assigned a property, in the
sense that one can decide whether a property is true or 
false by determining whether the point representing the 
state of the system is lying in or outside that subspace. 


The dynamical system will develop in time according to 
some dynamical equations like Newton’s equations of mo- 
tion, and the point describing the state will move in phase 
space correspondingly. Furthermore, in classical physics it
is assumed that the point can in principle be determined 
to arbitrary precision by a simultaneous measurement of 
the basic variables thereby fixing the point in phase space. 
And one also assumes that observations can be made 
which do not disturb the system, and hence do not affect 
the trajectory in phase space. These assumptions are an 
essential part of the classical physics paradigm.


The mechanics of a bit 


Let us now turn to a system even simpler than a single 
particle, which I call a dynamical bit. We are going to do
a bit of bit mechanics. I have chosen this system because
it links basic classical mechanics to basic information the- 
ory, and defines a simple quantum system as well. As we 
all know, a bit has two states (positions) labeled z = 0 
and z = 1, so its configuration space consists of two iso- 
lated points. Introducing a discrete time step (like the clock 
in a computer) allows us to define a discrete dynamics. 
We distinguish two possibilities: after the time step the bit 
changed to the other state or it stayed where it was. This 
begs for an additional binary state variable which we appropriately
call the bit-momentum p. Its value labels
two distinct states of motion, where p = 0 means ‘at rest’
and p = 1 means ‘on the move’.


Both the classical position and the classical momentum
space consist of two points, and therefore both bit-position
and bit-momentum are binary variables, which means that
all values can be added mod 2, meaning in particular that
1 + 1 = 0.


Figure II.1.3: Phase space. The phase space of the dynamical
bit consists of four points.


Binary mechanics. The phase space for this dynamical 
bit corresponds to four points 


$$ \mathcal{F}_{ph} = \{(p, z)\} = \{(0,0),\ (0,1),\ (1,0),\ (1,1)\} $$


as indicated in Figure II.1.3. To push the comparison with
Newtonian mechanics even further, one could say that the 
dynamical state in the absence of further interactions would 
be characterized by the conservation of momentum. Then 
with p = 0 the bit would be ‘at rest’ indefinitely, in which 
case the position is conserved as well, but with p = 1, 
the bit keeps hopping between the two position
states. Depending on the initial condition one finds two 
fixed points and one two-cycle. The phase space picture of 
the possible dynamics is given in Figure II.1.4 (top). Maybe
you have already noted the amusing possibility of introducing
a bit-force F, defined à la Newton as the change
in bit-momentum. F also takes a binary value: F = 0
leaves the momentum unchanged, while with F = 1 the
momentum value changes, which leads to a different dynamic
consisting of the four-cycle depicted in Figure II.1.4
(bottom). These variables are elements of a Boolean algebra,
discussed in the Math excursion on algebras in Volume III.


Figure II.1.4: Bit mechanics. Phase space picture of ‘Newtonian’
bit-dynamics with a binary force F being either 0 (top) or 1
(bottom). For F = 0 there are two fixed points and one two-cycle,
for F = 1 there is only one four-cycle.
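
The fixed points and cycles of the dynamical bit can be enumerated with a few lines of Python. This is a minimal sketch of the rule as described in the text (position updated with the old momentum, momentum changed by the force, everything mod 2); the function names `step` and `orbit` are illustrative choices.

```python
def step(p, z, force=0):
    """One clock tick of the 'Newtonian' bit dynamics.

    The position is updated with the old momentum, and the momentum is
    changed by the binary force; all arithmetic is mod 2.
    """
    return (p + force) % 2, (z + p) % 2

def orbit(state, force=0):
    """Follow a state (p, z) until it returns to its starting point."""
    visited = [state]
    current = step(*state, force=force)
    while current != state:
        visited.append(current)
        current = step(*current, force=force)
    return visited

PHASE_SPACE = [(p, z) for p in (0, 1) for z in (0, 1)]

for force in (0, 1):
    print("F =", force)
    for state in PHASE_SPACE:
        print("  start", state, "-> orbit", orbit(state, force))
```

For F = 0 the two states with p = 0 are their own orbits and the two states with p = 1 form a two-cycle; for F = 1 every starting point lies on the same four-cycle.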


Complementary representations. The system is clearly
completely deterministic, because given the initial binary z
and p values, its future states after an arbitrary number of 
time steps can be calculated. These discrete dynamics are 
like a little automaton, an updating procedure for the z-bit 
that depends on the p-bit. Updating means that the states 
of the two-bit system change and therefore the dynamics 
define a logical gate in the sense of digital computation. 
So we have arrived at four alternative ways to characterize 
the dynamics of the bit: 

(i) as an updating algorithm or iterative map $|\mathrm{in}\rangle \to |\mathrm{out}\rangle$,
(ii) as a diagram representing the gate,
(iii) as a two-bit to two-bit input-output table,
(iv) and as a 4 × 4 matrix acting on the column vector of
two-bit in-states (p, z) = {(0,0), (0,1), (1,0), (1,1)}.

For F = 0 this looks as follows: (i) the algorithm generating 
the dynamics is just, 


$$ (p, z) \to (p,\ (z + p) \bmod 2), $$


which corresponds to the (ii) diagram, (iii) state map, or (iv) 
the (block-diagonal) matrix as given in Figure II.1.5.


Gates and information dynamics. From the picture we 
learn that the two-bit dynamic is in fact generated by a 
two-bit gate which is well known as the controlled NOT- 
or CNOT-gate. The diagram should be read as follows: 
the horizontal lines correspond to the two incoming (left) 
and outgoing (right) bits. It is a conditioned gate, which 
is indicated by the vertical line from the p-bit to the z-bit. 
The encircled plus symbolizes a NOT-gate acting on the 
z-bit, but its action is conditioned on the value of the p-bit: 
it is activated if p = 1 and not if p = 0. The dot on the 
p-line indicates that it is the control bit, not changing value 
by passing the dot. With this interpretation it is straightfor- 
ward to compute the entries of the input-output table. One 
puts the input state on the lines at the left and then follows 
the lines through the diagram to the right performing the 
instructions one encounters. 


Figure II.1.5: Three representations. The F = 0 bit dynamics
is generated by the CNOT-gate. In the ‘block-diagonal’ matrix
representation on the right, we marked the two fixed points and
the two-cycle.


This matrix acts like a permutation matrix on the input column
vector of two-bit in-states. Indeed, we see that on the
top two entries it acts like a unit matrix, while on the bottom
two entries (x) it acts like a NOT-gate.
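
The input-output table and the 4 × 4 matrix of the F = 0 dynamics can be generated directly from the updating rule. The sketch below assumes the basis ordering (p, z) = 00, 01, 10, 11 and the convention that the matrix acts on the column of in-states; the names `cnot`, `table` and `matrix` are illustrative.

```python
def cnot(p, z):
    """F = 0 update rule: the control bit p is kept, z is flipped if p = 1."""
    return p, (z + p) % 2

# Basis ordering assumed here: (p, z) = 00, 01, 10, 11.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]

# (iii) Input-output table: each in-state paired with its out-state.
table = {state: cnot(*state) for state in STATES}

# (iv) 4 x 4 matrix: row i has a single 1 in the column of the out-state,
# so acting on the column of in-states it returns the column of out-states.
matrix = [[1 if table[s_in] == s_out else 0 for s_out in STATES]
          for s_in in STATES]

for s_in, s_out in table.items():
    print(s_in, "->", s_out)
for row in matrix:
    print(row)
```

The upper-left 2 × 2 block of the printed matrix is the unit matrix (the two fixed points), and the lower-right block swaps the last two entries (the two-cycle).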


The NEWTON gate. Imagine that we also include the ‘bit- 
force’ we defined as a third force-bit F. Then we obtain 
an interesting three-bit gate for the complete dynamics of 
the system. One finds that it can be characterized by the 
updating algorithm: 


$$ (F, p, z) \to (F,\ (p + F) \bmod 2,\ (z + p) \bmod 2), $$


which corresponds to the diagram and state map of Figure
II.1.6 and the matrix in equation (II.1.1).


Figure II.1.6: NEWTON-map. The three-bit NEWTON-gate
and the corresponding $|\mathrm{in}\rangle \to |\mathrm{out}\rangle$ map acting on the column
vector of (F, p, z) states.


On the first four rows the matrix acts like a CNOT, and in the second
block it performs some sequence of permutations. In that
sense this NEWTON-gate actually computes something
on three bits, but from the diagram we see that it is not
an irreducible three-bit gate, rather it is composed of two
sequentially applied CNOT-gates. It has the following 8 × 8
matrix structure in a basis given by the first three columns
of the $|\mathrm{in}\rangle$ states of the table in Figure II.1.6. Note that
due to the four bottom entries corresponding to F = 1,
the fourth power of the NEWTON-gate is equal to the unit
matrix. Hence, the dynamics generated has indeed period
four, as one would expect if the force is constant. That
causes p to hop with period two and z with period four. It is
the dynamics of the bottom diagram in Figure II.1.4.


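$$
\mathrm{NEWTON}:\quad
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0
\end{pmatrix}
\tag{II.1.1}
$$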


The matrix corresponding to this NEWTON-gate, displayed 
above, is unitary in the sense that the transpose of the ma- 
trix is indeed its inverse. But the matrix is not symmetric, 
meaning that it is not a time reversal invariant process, because
then the matrix would have to be its own inverse.
This, however, is the case for the CNOT-gate represented 
by the matrix in Figure II.1.5. 
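
The stated properties of the NEWTON-gate can be checked mechanically. The sketch below builds the 8 × 8 matrix from the updating rule, assuming the basis ordering (F, p, z) = 000, 001, ..., 111 and the convention that row i carries a 1 in the column of the out-state; the helper names (`newton`, `two_cnots`, `matmul`, `transpose`) are illustrative, and plain Python lists are used instead of a matrix library.

```python
from itertools import product

def newton(F, p, z):
    """Three-bit update rule (F, p, z) -> (F, p + F, z + p), all mod 2."""
    return F, (p + F) % 2, (z + p) % 2

def two_cnots(F, p, z):
    """The same map as two CNOTs in sequence: p controls z, then F controls p."""
    z = (z + p) % 2
    p = (p + F) % 2
    return F, p, z

# Basis ordering assumed here: (F, p, z) = 000, 001, ..., 111.
STATES = list(product((0, 1), repeat=3))

# Row i has its single 1 in the column of the out-state of STATES[i].
NEWTON = [[1 if newton(*s_in) == s_out else 0 for s_out in STATES]
          for s_in in STATES]

def matmul(a, b):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

IDENTITY = [[int(i == j) for j in range(8)] for i in range(8)]

squared = matmul(NEWTON, NEWTON)
fourth_power = matmul(squared, squared)

print("two CNOTs:", all(newton(*s) == two_cnots(*s) for s in STATES))
print("transpose is inverse:", matmul(NEWTON, transpose(NEWTON)) == IDENTITY)
print("symmetric:", NEWTON == transpose(NEWTON))
print("fourth power is identity:", fourth_power == IDENTITY)
```

The square of the matrix is not yet the identity, because the F = 1 block is a genuine four-cycle; only the fourth power returns every state to where it started.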


Conserved energies. In classical Hamiltonian mechan- 
ics one may derive the equations of motion, or the time 
evolution once the energy function is given, as we showed 
in Chapter I.1. In the case of discrete dynamics it is less
straightforward as we cannot take derivatives in the nor- 
mal way. Because all variables are binary, small variations 
are nonexistent! The role of the Hamiltonian is played by 
the updating algorithm because that generates the time 
translation of the system. It is that mapping, which by 
repeated application maps out the time trajectory of the 
system in phase space. In these discrete cases one may 
invert the question by asking whether there is a (binary) 
energy function E(z, p) that is conserved in the time se- 
ries, i.e. whose value does not change for the subsequent 
points on a given orbit in phase space. 


Let us look at some simple candidates. These can come 
across as slightly unusual, exactly because the energy is 
also a binary variable, implying that it can take only two 
possible values. The good thing about that is that the en- 
ergy stays always bounded and therefore the system is 
always well-defined. 


Example 1: E = p. You would expect the energy of a free
particle to be proportional to $p^2$, and since p is a Boolean
variable we have that $p^2 = p$. The free particle does not
experience any force and so one expects that the Newtonian
dynamics rule $(p, z) \to (p, z + p)$ will apply. This is
indeed the case where F = 0 which we discussed before
and p is preserved. It has two fixed points with E = 0 and
one periodic orbit of length two with E = 1:


$$ (0,0)\,;\ (0,1) \quad\text{and}\quad (1,0) \leftrightarrow (1,1) $$


Example 2: E = F = 1. This is the case of a non-zero constant
force, conserved under the Newtonian rule
$(p, z) \to (p + 1, z + p)$. Its action corresponds to one periodic orbit
of length four with energy E = 1:


$$ (0,0) \to (1,0) \to (0,1) \to (1,1) \to (0,0) \to \cdots $$


Example 3: Q = p + z. The function Q is a conserved
‘charge’, or ‘constant of the motion’, under the clearly not
Newtonian rule $(p, z) \to (p + 1, z + 1)$. It has two
periodic orbits of length two, which are now along the diagonals
of the phase space, one with Q = 0 and the other
with Q = 1:


$$ (0,0) \leftrightarrow (1,1) \quad\text{and}\quad (1,0) \leftrightarrow (0,1) $$
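
Whether a binary function is conserved under a given updating rule can be tested by brute force over the four points of phase space. The following sketch does this for the three rules and the three candidate functions of the examples; the dictionary labels and the helper `is_conserved` are illustrative names.

```python
PHASE_SPACE = [(p, z) for p in (0, 1) for z in (0, 1)]

# The three update rules of the examples, all arithmetic mod 2.
RULES = {
    "free (F = 0)": lambda p, z: (p, (z + p) % 2),
    "constant force (F = 1)": lambda p, z: ((p + 1) % 2, (z + p) % 2),
    "non-Newtonian": lambda p, z: ((p + 1) % 2, (z + 1) % 2),
}

# Candidate binary 'energy' functions on phase space.
CANDIDATES = {
    "p": lambda p, z: p,
    "constant 1": lambda p, z: 1,
    "p + z": lambda p, z: (p + z) % 2,
}

def is_conserved(quantity, rule):
    """True if the quantity has the same value before and after every step."""
    return all(quantity(*rule(p, z)) == quantity(p, z) for p, z in PHASE_SPACE)

for rule_name, rule in RULES.items():
    conserved = [name for name, q in CANDIDATES.items() if is_conserved(q, rule)]
    print(rule_name, "conserves:", conserved)
```

A constant function is trivially conserved by every rule, which is why the interesting statements are that p survives only the free rule and p + z only the non-Newtonian one.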


Quantum states: Hilbert space 


We discuss the generic setting of a quantum system. For
a quantum system we have a set of states denoted $\{|\psi\rangle\}$,
which are vectors that correspond to elements of the so-called
Hilbert space $\mathcal{H}$ of the system. The basic quantum
setting introduces two novel ingredients: one is the complexification,
and the other the linear superposition principle
of states. These have dramatic consequences.


The Hilbert space of states. To explain the basic ideas of 
quantum theory, or for that matter of quantum information, 
we will in this section restrict our attention again mainly to 
the qubit, which can be viewed as the basic building block 
of quantum information systems. The physical state of a 
quantum system is described by a wavefunction which 
can be thought of as a vector in an abstract multidimensional
space of states, called the Hilbert space and denoted by
$\mathcal{H}$. For the moment, this is just a finite-dimensional vector
space where the vectors have complex, rather than real, 
coefficients, and where the length of a vector is the usual 
length in such a space, i.e. the square root of the sum of 
the (absolute) squares of its components along the axes. 






Hilbert space replaces the concept of phase space in clas- 
sical mechanics. Collections of observables, or measur- 
able variables such as spin, charge, position, or momen- 
tum, can be used to set up an orthogonal basis for the 
Hilbert space. 


As we will see, a dramatic difference from classical me- 
chanics with tremendous consequences is that many quan- 
tum mechanical quantities, such as position and momen- 
tum, or spin components along the x-axis and the y-axis, 
cannot be measured simultaneously. Another essential 
difference from classical physics is that the dimensionality 
of the state space of the quantum system is huge com- 
pared to that of the classical phase space. To illustrate 
this drastic difference, think of a particle that can move 
along an infinite line with an arbitrary momentum. From 
the classical perspective it has a phase space that is two- 
dimensional and real (a position x and a momentum p), but 
from the quantum point of view the particle is described by 
a wavefunction $\Psi$ of one variable (typically the position x or
the momentum p). The state is thus determined by specifying
a function for all points x. As the state corresponds to
a function, the state space must be a ‘space of functions’.
Formally such a wavefunction corresponds to an element 
of an infinite-dimensional Hilbert space which is a space of 
functions that satisfy certain restrictions. So, we go from 
two real numbers classically to a complex function of one 
variable in the quantum domain. That is quite a difference 
indeed! We will address the topic of quantum particles in 
detail in Chapter II.5. 


States of a quantum bit 


Now you might have thought that this is not such a big 
deal, because the classical state corresponds to a point in 
phase space and that point can be characterized by a vec- 
tor in phase space. But this is not the way to think about it. 
We just mentioned the dynamical bit as an example of an
almost trivial dynamical system. To this classical system
corresponds a quantum system called the quantum bit or
qubit for short, and the statement is that to every point in
the configuration space of the classical bit we associate a
basis vector of the Hilbert space. So the bit-position space
consists of two points {1, 0}, and hence the Hilbert space
of the qubit is two-dimensional and may be thought of as
spanned by two orthogonal unit vectors $\{|1\rangle, |{-1}\rangle\}$.¹


A general state of a qubit is described by a wavefunction
or state vector $|\psi\rangle$, also called a ket or ket vector, which
can be written as


$$ |\psi\rangle = \alpha\,|{+1}\rangle + \beta\,|{-1}\rangle \quad\text{with}\quad |\alpha|^2 + |\beta|^2 = 1, \tag{II.1.2} $$


where $\alpha$ and $\beta$ are complex numbers.² Any linear combination
of the two basis states corresponds to an admissible
quantum state, as long as it satisfies the normalization
condition, meaning that the sum of the absolute squares of the
components equals one. This means that you can think of
$|\psi\rangle$ as a unit vector in the 2-dimensional complex vector
space, denoted $\mathbb{C}^2$, spanned by the two basis vectors $|1\rangle$
and $|{-1}\rangle$.


¹ We switch from a ‘1, 0’ labeling in the classical domain to a ‘1, −1’
labeling in the quantum domain; these are matters of notation and of
mathematical convenience, as we will see later.

² For a tailor-made introduction to complex numbers and vectors see
the Math excursions on pages 630 and 632 of Volume III. It is important
for complex numbers that basic algebraic operations like addition,
subtraction, multiplication and division can be defined. It is almost like
in the musical Annie Get Your Gun: ‘Everything you can do I can do
better.’


Figure II.1.7: State decomposition. Decomposition of a real
qubit state vector $|\psi\rangle$, the purple arrow, into its components $\alpha$
and $\beta$ with respect to the blue basis or frame $\{|{+1}\rangle, |{-1}\rangle\}$. The
circle represents the subspace of the real states, and in that
case we clearly have that $\alpha = \sin\theta$ and $\beta = \cos\theta$. We have
marked some of the other real states that we will refer to in the
text.


The geometry of qubit state space. What we have
learned so far is that a finite state classical system will lead
to a finite-dimensional complex vector space for the corresponding
quantum system. Let us describe the geometry
of the quantum configuration space of a single qubit
in more detail. The constraint $|\alpha|^2 + |\beta|^2 = 1$ says that the
state vector has unit length, which defines the complex unit
circle in $\mathbb{C}^2$, but if we write the complex numbers in terms
of their real and imaginary parts as $\alpha = a_1 + i a_2$ and
$\beta = b_1 + i b_2$, then we obtain
$|a_1 + i a_2|^2 + |b_1 + i b_2|^2 = a_1^2 + a_2^2 + b_1^2 + b_2^2 = 1$.
The geometry of the space described by the latter equation is
just the three-dimensional unit sphere $S^3$ embedded in a
four-dimensional Euclidean space $\mathbb{R}^4$ with coordinates
$a_1$, $a_2$, $b_1$, and $b_2$. This three-dimensional sphere is closely
related to what in physics is referred to as the Bloch sphere:
identifying state vectors that differ only by an overall phase
factor reduces $S^3$ to the two-dimensional sphere $S^2$.
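
The statement that a normalized pair of complex amplitudes is the same thing as a point on the unit sphere in four real dimensions is easy to check numerically. In this minimal sketch the starting amplitudes and the helper names `normalize` and `real_coordinates` are arbitrary illustrative choices.

```python
import math

def normalize(alpha, beta):
    """Rescale two complex amplitudes so that |alpha|^2 + |beta|^2 = 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    if norm == 0:
        raise ValueError("the zero vector is not a quantum state")
    return alpha / norm, beta / norm

def real_coordinates(alpha, beta):
    """Split the amplitudes into the four real numbers a1, a2, b1, b2."""
    return alpha.real, alpha.imag, beta.real, beta.imag

# An arbitrary (hypothetical) pair of complex amplitudes.
alpha, beta = normalize(1 + 2j, 3 - 1j)
a1, a2, b1, b2 = real_coordinates(alpha, beta)
radius_squared = a1 ** 2 + a2 ** 2 + b1 ** 2 + b2 ** 2

print("alpha =", alpha, " beta =", beta)
print("|alpha|^2 + |beta|^2 =", abs(alpha) ** 2 + abs(beta) ** 2)
print("a1^2 + a2^2 + b1^2 + b2^2 =", radius_squared)
```

Both printed sums equal one up to rounding, which is the content of the normalization condition written once in complex and once in real coordinates.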


Figure II.1.8: Configuration versus Hilbert space. A classical
system with a configuration space corresponding to a set of
three points. The quantum Hilbert space for this system would
correspond to the unit-sphere in the complex three-dimensional
space $\mathbb{C}^3$. In the figure we show the restriction of that space
to real states forming a two-sphere. Classical and quantum
spaces are structurally very different. There is a ‘world’ in between
which is described by the formalism we are about to explore.


Complex rotations. At this point it is appropriate to make
a side comment. As the state of a qubit is a normalized
two-dimensional complex vector, the state space of a qubit
corresponds to a complex circle, which in turn equals $S^3$.
All states on the complex unit circle can be obtained by acting
with all complex rotations on a given qubit state in $\mathbb{C}^2$.
This is by definition the group SU(2) and having argued
that these vectors can be transformed into each other by
the elements $U \in SU(2)$, we can also conclude that the
space of all SU(2) transformations is in one-to-one correspondence
with the points on the three-sphere $S^3$. We will
use these geometric representations of state spaces and
transformation groups later on, because they are easier to
understand than just formulas.
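
The correspondence between SU(2) and the three-sphere can be illustrated with the standard parametrization of an SU(2) matrix by two complex numbers a and b with |a|² + |b|² = 1. That parametrization, the three angles and the helper names below are assumptions made for this sketch, not notation from the text.

```python
import cmath
import math

def su2(a, b):
    """Matrix [[a, -conj(b)], [b, conj(a)]]; it lies in SU(2) if |a|^2 + |b|^2 = 1."""
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def apply(u, state):
    """Act with a 2 x 2 matrix on a two-component state (alpha, beta)."""
    alpha, beta = state
    return (u[0][0] * alpha + u[0][1] * beta,
            u[1][0] * alpha + u[1][1] * beta)

def norm_squared(state):
    return sum(abs(c) ** 2 for c in state)

# A point on the three-sphere, written with three arbitrarily chosen angles.
theta, phi, chi = 0.7, 1.9, -0.4
a = math.cos(theta) * cmath.exp(1j * phi)
b = math.sin(theta) * cmath.exp(1j * chi)
U = su2(a, b)

determinant = U[0][0] * U[1][1] - U[0][1] * U[1][0]
rotated = apply(U, (1, 0))          # image of the basis state |+1>

print("det U =", determinant)
print("rotated state =", rotated, "norm^2 =", norm_squared(rotated))
```

The image of the basis state is just the pair (a, b), so each point of the three-sphere labels one SU(2) element and, at the same time, the normalized state that this element produces from the reference state.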


Real states. For pedagogical reasons it is advantageous 
to limit ourselves for the moment to the subspace corresponding
to real states. This means that one only considers
states for which $\alpha$ and $\beta$ are real and the condition
$\alpha^2 + \beta^2 = 1$ imposes that the states lie on an ordinary
circle in $\mathbb{R}^2$. The real states are depicted in Figure II.1.7,
where we have also marked some special states. Many of 
the formal quantum properties can be explained within this 
real subspace.


Alternative notations. We may represent the state by the
column vector of its components:


$$ |\psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. $$


If you like, you can also map the states of the classical configuration
space into the quantum picture; then the classical
bit would only have the two states $|{\pm 1}\rangle$, corresponding to
the basis vectors


$$ |{+1}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |{-1}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, $$

while the qubit can be any normalized linear combination 
of these two basis states. This makes the dramatic difference
between the classical and quantum setting quite visible
indeed. Each point in the configuration space $\mathcal{X}$ of the
classical system corresponds to an orthogonal basis vector
of the Hilbert space, and consequently adding a point
to the configuration space $\mathcal{X}$ adds a dimension to $\mathcal{H}$. So
in this picture the classical states correspond to the corners
of a unit hypercube in that higher dimensional space,
while the quantum states lie on the unit-hypersphere embedded
in that space. This is illustrated in Figure II.1.8 for
a three-state system. 


The scalar or dot product 


Ordinary, say ‘high school’, vectors are called real vectors.
You may remember how the length $|\mathbf{a}|$ of a vector
$\mathbf{a}$ was defined as the square root of the sum of the
squares of its components, $|\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + \dots}$. And the
dot or inner product of two vectors $\mathbf{a}$ and $\mathbf{b}$ was similarly
defined as $\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 + \dots = |\mathbf{a}|\,|\mathbf{b}|\cos\theta$,
with $\theta$ the angle between them.


Conjugate states. For the state or ket vectors $|\psi\rangle$, we
basically want to do the same thing, but because the vectors
are complex, it is slightly more complicated. However,
once you understand the definition, a notation introduced
by Dirac will make it like ‘real’ vectors. We first define
the dual of the vector space $\mathbb{C}^2$ with dual or conjugate
vectors, called bra vectors, that can be represented
as row vectors with complex conjugated elements, where
$\alpha^* = a_1 - i a_2$ etc. Following the notation introduced by
Dirac we write this like,


$$ \langle\psi| = \langle 1|\,\alpha^* + \langle{-1}|\,\beta^*. \tag{II.1.3} $$


This somewhat strange nomenclature of bra and ket vec- 
tors makes more sense once you realize that they allow 
you to make a bracket, and this bracket is nothing but a 
scalar product of two vectors. 


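The inner product. The scalar (or inner, or dot) product
maps a bra-and-ket-pair into a complex number (the scalar).
So if we have two state vectors $|\psi\rangle$ and $|\phi\rangle = \gamma|1\rangle + \delta|{-1}\rangle$,
then their bracket is defined as


$$ \langle\phi|\psi\rangle = \langle\psi|\phi\rangle^* = \gamma^*\alpha + \delta^*\beta. \tag{II.1.4} $$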


As the components of the state vectors are complex, the 
dot product of two vectors is also, and it is thus no longer 
true that it equals the product of the lengths of the vectors 
and the cosine of the angle between them. But, just like 
in the real case, we call two vectors whose dot product 
vanishes orthogonal or perpendicular. Similarly, the inner 
product of a vector with itself, which is always a real num- 
ber, is defined as the length squared of that vector. 
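
A small Python sketch makes the role of the complex conjugation in the bracket explicit. States are written as tuples (component along |1⟩, component along |−1⟩); the three example states and the function name `bracket` are hypothetical choices made for this illustration.

```python
def bracket(phi, psi):
    """<phi|psi>: conjugate the components of the bra, then sum the products."""
    return sum(f.conjugate() * s for f, s in zip(phi, psi))

# Hypothetical example states, given as (component along |1>, along |-1>).
psi = (0.6, 0.8j)
phi = (0.8, 0.6)
chi = (0.8, -0.6j)    # chosen to be orthogonal to psi

print("<phi|psi> =", bracket(phi, psi))
print("<psi|phi> =", bracket(psi, phi))
print("<psi|psi> =", bracket(psi, psi))
print("<chi|psi> =", bracket(chi, psi))
```

Exchanging bra and ket turns the bracket into its complex conjugate, the bracket of a normalized state with itself is the real number one, and the last pair of states is orthogonal even though neither has a vanishing component.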


Probability amplitudes. It turns out that the dot product 
of state vectors has an important physical interpretation as 
a probability amplitude, and it plays a fundamental role if 
we are going to talk about quantum measurements. We 
will discuss this extensively later in this chapter, but it is 
useful to preview here already the basic idea. Let us look
at Figure II.1.7, where we have a state $|\psi\rangle$, and if we want
the outcome with respect to the blue $\{|1\rangle, |{-1}\rangle\}$ frame,
then the probability to find the outcome +1 would be the
absolute square of the probability amplitude:


$$ p_{+1} = \langle\psi|1\rangle\langle 1|\psi\rangle = \alpha^*\alpha = |\alpha|^2. \tag{II.1.5} $$


This assignment of a probability to the inner product of 
two state vectors is called the Born rule, after Max Born, 
the quantum pioneer who proposed the probability inter- 
pretation of quantum mechanics. It is also referred to as 
the Kopenhagener Deutung, or Copenhagen interpretation.
Clearly a similar calculation for the −1 outcome would
give the probability $p_{-1} = |\beta|^2$. The normalization of the
state vector is just the statement that the total probability
for finding one of the two possible outcomes is one:
$p_{+1} + p_{-1} = 1$. Making a measurement means that we
get new information on the state and that affects the prob- 
abilities for the measurements after that. This means that 
the state vector has to change, because it has to reflect 
the probabilities of measurement outcomes at any instant. 
In this simple example the following happens: if we obtain
+1 the state will change to the plus one state, $|\psi\rangle \to |1\rangle$.
So the state gets ‘projected’ on the state, which gives that 
measurement outcome with unit probability. This you can 
interpret as saying that if you measure a quantum system 
and find a certain outcome, then if you repeat the mea- 
surement immediately afterwards you will find the same 
outcome. 
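
The Born rule and the projection of the state can be mimicked with a pseudo-random number generator. This is only a sketch of the bookkeeping, under the assumption that a measurement in the {|1⟩, |−1⟩} frame is modelled by drawing an outcome with the Born probabilities and replacing the state by the corresponding basis state; the example state, the seed and the function names are illustrative.

```python
import random

def born_probabilities(state):
    """Probabilities of the outcomes +1 and -1 for a normalized (alpha, beta)."""
    alpha, beta = state
    return abs(alpha) ** 2, abs(beta) ** 2

def measure(state, rng=random):
    """Draw an outcome with the Born probabilities and project the state."""
    p_plus, _ = born_probabilities(state)
    if rng.random() < p_plus:
        return +1, (1, 0)      # state projected onto |+1>
    return -1, (0, 1)          # state projected onto |-1>

psi = (0.6, 0.8j)              # hypothetical normalized state
print("probabilities:", born_probabilities(psi))

rng = random.Random(0)         # fixed seed so the run is repeatable
outcome, collapsed = measure(psi, rng)
repeat, _ = measure(collapsed, rng)
print("first outcome:", outcome, "repeated outcome:", repeat)
```

Because the projected state assigns probability one to the outcome just found, the repeated measurement always reproduces the first result, whatever that result happened to be.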


Projectors. There is an alternative way to read equation
(II.1.5). One needs to first look at the object


$$ P_1 = |1\rangle\langle 1|; \tag{II.1.6} $$


this is not an inner product, but rather a projector on the
state $|1\rangle$. If this operator acts on an arbitrary state $|\psi\rangle$, it
produces the projection, equal to $\langle 1|\psi\rangle$, along the $|1\rangle$ basis
vector:


$$ P_1|\psi\rangle = |1\rangle\langle 1|\psi\rangle. $$


So the probability to find an outcome +1 is also obtained
by ‘sandwiching’ the projector $P_1$ in the state $|\psi\rangle$:


$$ p_{+1} = \langle\psi|P_1|\psi\rangle. $$


Figure II.1.9: Two frames. We have depicted two different
frames for the two-dimensional qubit Hilbert space. The blue
basis consists of the states $\{|{\pm 1}\rangle\}$, whereas the green basis
consists of the states $\{|\pm\rangle\}$.


These probability and measurement definitions will be used 
extensively in the next chapter. 
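
The projector on the state |1⟩ can be written out as a 2 × 2 matrix and applied to a state by hand. The sketch below uses plain Python tuples and lists; the example state and the helper names `outer`, `apply` and `bracket` are illustrative choices.

```python
def outer(ket, bra_of):
    """Matrix |ket><bra_of| built from two component tuples."""
    return [[k * b.conjugate() for b in bra_of] for k in ket]

def apply(matrix, state):
    """Act with a matrix (list of rows) on a state given as a tuple."""
    return tuple(sum(m * s for m, s in zip(row, state)) for row in matrix)

def bracket(phi, psi):
    """<phi|psi> with complex conjugation of the bra components."""
    return sum(f.conjugate() * s for f, s in zip(phi, psi))

PLUS_ONE = (1, 0)                      # components of the basis state |1>
P1 = outer(PLUS_ONE, PLUS_ONE)         # projector |1><1|

psi = (0.6, 0.8j)                      # hypothetical normalized state
projected = apply(P1, psi)             # equals <1|psi> times |1>
probability = bracket(psi, projected)  # the 'sandwich' <psi|P1|psi>

print("P1 =", P1)
print("P1|psi> =", projected)
print("<psi|P1|psi> =", probability)
```

Applying the projector a second time leaves the projected vector unchanged, and the sandwich reproduces the absolute square of the component of the state along |1⟩.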


A frame or basis 


It is convenient to choose an orthonormal frame consisting
of unit length, mutually orthogonal basis vectors that
‘span’ the vector space. This amounts to choosing a set of
basis vectors $|i\rangle$ with $i = -1, 1$, which have the property
that:


$$ \langle i|j\rangle = \delta_{ij}, \tag{II.1.7} $$


with the Kronecker ‘delta’ symbol defined as follows: $\delta_{ij}$
equals one if i = j, and equals zero otherwise.


Note that if you think of the qubit as a spin, then the states 
with spin up or down point in parallel but ‘opposite’ directions
in real space but they are represented by two orthogonal
vectors in the state space of the quantum spin. The
state space picture therefore looks similar to that of the
two polarizations of a photon, which are also orthogonal in
real space. Yet there remains an essential difference: the
qubit is what we call a spinor while the photon is a real
vector. Note also that there are many choices of frame
possible, for example the states $|+\rangle$ and $|-\rangle$ also form an
orthonormal frame, as is depicted as the ‘green frame’ in
Figure II.1.9.
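
The orthonormality condition can be verified for both frames at once. The components of the green frame are not given in the text; the sketch below assumes the common convention |±⟩ = (|1⟩ ± |−1⟩)/√2, that is, the blue frame rotated by 45 degrees. That convention and the names `BLUE`, `GREEN` and `gram` are assumptions of this example.

```python
import math

def bracket(phi, psi):
    """<phi|psi> with complex conjugation of the bra components."""
    return sum(f.conjugate() * s for f, s in zip(phi, psi))

S = 1 / math.sqrt(2)

# Blue frame: the basis states |1> and |-1>.
BLUE = {"+1": (1.0, 0.0), "-1": (0.0, 1.0)}
# Green frame, ASSUMED here to be the blue frame rotated by 45 degrees.
GREEN = {"+": (S, S), "-": (S, -S)}

def gram(frame):
    """Table of all brackets <i|j> within one frame."""
    return {(i, j): bracket(frame[i], frame[j]) for i in frame for j in frame}

for name, frame in (("blue", BLUE), ("green", GREEN)):
    print(name, gram(frame))

# Overlap between a blue and a green basis state.
print("<+1|+> =", bracket(BLUE["+1"], GREEN["+"]))
```

Within each frame the table of brackets is the Kronecker delta, while a bracket between the two frames is neither zero nor one, which is what makes the choice of frame physically relevant once measurements are discussed.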


The linear superposition principle 


The expression (II.1.2) is an expansion of the state vector
$|\Psi\rangle$ in an orthonormal basis $\{|i\rangle\}$. This is a general rule: for
any state vector in any D-dimensional Hilbert space and
any choice of basis one may write:


$$ |\Psi\rangle = \sum_i \alpha_i\,|i\rangle, \tag{II.1.8} $$


where once more the $\alpha_i$ are the components of the state
vector in that particular basis. This linear superposition 
principle is a general property and is a consequence of 
the fact that the Hilbert space of quantum states is a vector 
space. Any linear combination of state vectors is (after nor- 
malization) again a possible quantum state. It follows from 
there also that any state can be expanded in a complete 
set of basis vectors, a property we have used above. 


We can now show what it means to say that a state vector
$|\Psi\rangle$ has unit length by writing:

$$\langle\Psi|\Psi\rangle = \sum_{i,j} \alpha_j^{*}\,\alpha_i\, \langle j|i\rangle = \sum_i |\alpha_i|^2 = 1\,. \tag{II.1.9}$$
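
The statements about frames, expansions and unit length can be checked numerically. The sketch below assumes the normalization $|\pm\rangle = (|1\rangle \pm |-1\rangle)/\sqrt{2}$ for the ‘green’ frame and uses an illustrative state; the helper name `gram` is hypothetical.

```python
import numpy as np

# 'Blue' frame: |1>, |-1>. 'Green' frame: |+>, |-> (normalization assumed here).
blue = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
s = 1 / np.sqrt(2)
green = [s * (blue[0] + blue[1]), s * (blue[0] - blue[1])]

def gram(frame):
    """Matrix of inner products <i|j>; the identity for an orthonormal frame."""
    return np.array([[np.vdot(a, b) for b in frame] for a in frame])

# Illustrative state, expanded in the green frame: alpha_i = <i|psi>.
psi = np.array([0.6, 0.8j])
components_green = np.array([np.vdot(b, psi) for b in green])

# Rebuild the state from its components and check the unit-length condition.
rebuilt = sum(c * b for c, b in zip(components_green, green))
norm_sq = np.sum(np.abs(components_green) ** 2)
```

The components differ from frame to frame, but the sum of their absolute squares does not: `norm_sq` is one in any orthonormal frame.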


With what I just said, you may get worried about the Hilbert 
space for a real particle, because already in one dimension 
the configuration space is a line, corresponding to a con- 
tinuum of classically allowed positions. But how then can 
you ever build a vector space of that continuous collection 
of points? That space has to be infinite-dimensional for a 
start. 


Yes indeed, but in fact this can be done in a rigorous way! 
Our mathematical friends have shown that the space of 
functions on configuration space of the system is exactly 
the infinite-dimensional (!) Hilbert space of the type one 
needs to describe a particle with. The particle states cor- 
respond to functions on the classical configuration space, 
and as you may have guessed these are the famous wave- 
functions quantum people always talk about, the functions 
we introduced in Chapter I.4. The functions have to satisfy
the additional condition that their absolute squares are
normalizable, so that these squares can be interpreted as
probability densities. We will explore quantum states for
particles and fields in more detail in Chapter II.5.


Ultimate simplicity: a single state system? 


Let us make a small detour and imagine for a moment that you
were to ask the silly question about what the quantum
theory would look like for a system that
has only a single state. A particle that can only be
in one point. Should we waste our time with such a
thing, which seems worse than thinking about how
many angels can dance on the point of a needle,
as the great theologian Thomas Aquinas appears
to have worried about in the 13th century?


The quantum formalism would then say that this
pin-point particle has a one-dimensional Hilbert
space, so there is only one complex state vector
that has to be normalized to one. It would look like:

$$|\psi\rangle = \alpha|0\rangle \quad\text{with}\quad |\alpha|^2 = 1\,.$$

There is only one phase and that phase is an overall
phase which is not observable, as it drops out of the
only possible probability amplitude $\langle 0|0\rangle = 1$, and so
that finishes off the subject.

Except if we allow ourselves a minute amount of
freedom, maybe then....

So, let us imagine that this single state system 
represents the ground state of some real physical 
medium, and furthermore that possible other states 
in that medium have much higher energy, unreach- 
able for the system all by itself, after all where would 
the energy come from? And if it were to jump up 
spontaneously by some quantum magic, it would 
plunge down instantly anyway. So we have a one- 
state Hilbert space for this system that corresponds 
to its ground state. 

Now the critical readers are supposed to scratch 
their head and ask whether it is permitted to have 
two chunks of that material, both in that same 
ground state, but of course each with its own ‘un- 
observable’ phase. And they ask me: Sir, are two 
unobservable phases not a bit too much of obscu- 
rity? After all, what does overall phase mean in this 
context? Aha! Your point is well-taken. Two chunks 
making one system have one overall phase, but that 
leaves us with exactly one relative phase. But what 
is that good for, I may ask you in return. The puzzling
point is indeed that we have two exactly identical
pieces of exactly the same material, and we
know all there is to know about them. There is noth- 
ing we can learn about them by making more mea- 
surements. 

Well, let us sit back for a moment, and try to imagine
some classical situations that are vaguely similar.
I have two big chunks of material and I only talk
about one variable, say temperature. There happen
to be no thermal fluctuations because the material
has infinite thermal conductivity! What you suggest
is that we put one chunk in the freezer, and the other
we keep exactly at room temperature. Each chunk
in its own habitat is boring and stupid and nothing
happens. But imagine we bring them out in thermally
insulated boxes and put them on the table, and
then take away the insulation at two facing sides and
move them quite close. Sure enough the temperature
difference will have an effect and heat will start
flowing from the hot chunk to the cold chunk. In
spite of the gap in between, there will be a thermal
flow which is caused by the temperature difference.
After this poor classical analogue (poor, because
the temperature (difference) is of course a directly
measurable observable for the individual subsystems),
we rush back to our quantum chunks, each
with their own quantum vacuum phase angle. What
we did pick up is the idea that we should bring them
close together and see what happens.


The Josephson junction.
Often things don’t have to be complicated to
be interesting. What I am telling you is basically
the story of the Josephson junction, referring to an
effect which shows that with two slabs of
superconducting material in the same superconducting
ground state, but with different phase angles, one
can indeed obtain a ‘tunneling current’ from one
piece to the other! This is a truly remarkable
physical effect, entirely due to the phase difference
of two one-dimensional Hilbert spaces describing
the same ground state. In spite of the fact that the
slabs are not touching, they may interact quantum
mechanically if you bring them close. And that quantum
interaction turns the phase difference into an observable.

So, how can we understand this more precisely using
the Schrödinger equations for this system? We
have two parts to the system, with wavefunctions

$$|\psi_i\rangle = e^{i\theta_i}\,|0\rangle\,, \qquad i = A, B\,.$$

The state is just the lowest state and is constant
over the sample, and taking the inner product gives
the Cooper pair density; the normalization is therefore
$\langle\psi_i|\psi_i\rangle = \langle 0|0\rangle = n$, because the phases
cancel. This state itself is a rather non-trivial affair
but that doesn’t concern us here. We just have a
well-defined single state. If there is no coupling
between the two pieces of superconducting material,
then this is the end of the story. The situation is
completely static. We find ourselves talking about chunks
of superconducting material, in which nothing
happens as long as you stay below the energy
needed to break up a Cooper pair.


(a): Josephson junction. Two ‘identical’ slabs of
superconductor (A and B) with an insulating layer in between.
The ground states have few parameters: a homogeneous
charge density $ne \sim |\psi|^2$, a potential $V$, and a phase
angle $\theta$ (labeled $V_B$ and $\theta_B$ for slab B).


Then life gets simple again: effectively it only has
an angle which is hidden and does not really count
as a degree of freedom. Trivial! So that’s why we
discuss it here as a case of ultimate simplicity: it
really is less than a single particle, less even than
a qubit!


(b): Charge density. Potential landscape (purple curve)
and charge density (red curve). The potential is minimal
in the slabs, so the charges (Cooper pairs) are well
confined. But at the boundaries of the slabs the wavefunction
will decay exponentially, also on the insulator
side, so if the insulator gap is narrow enough the
wavefunctions of slab A and B will overlap and represent an
interaction.

But imagine we bring the two pieces very close.
The wavefunction decays exponentially outside the slabs,
into the space in between the two pieces,
so once they are very close they can interact
quantum mechanically, but not classically: the
insulating material in between acts like a high
potential barrier. Yet the two pieces interact, which
means that there is some weak coupling $w$. This
situation is depicted in figure (b), and the interaction
leads to cross terms in the equations as follows:


$$i\hbar\,\frac{d|\psi_A\rangle}{dt} = eV_A\,|\psi_A\rangle + w\,|\psi_B\rangle\,, \tag{II.1.10}$$

and a corresponding equation for $|\psi_B\rangle$ of the
same form, with $V_B$ and a term $-w|\psi_A\rangle$.
A DC current. We start by considering the case
with $V_A = V_B$. Now we don’t have to solve the problem
in all detail, as we mainly want to know what the
effect of the interaction on the charge densities is.
We know that the electric current (density) $J$ is
defined as the time derivative of the charge (density),
where the charge density is just $\rho = e\langle\psi_A|\psi_A\rangle$,
with $-2e$ the charge of a Cooper pair,

$$J = \frac{d\rho}{dt} = e\,\frac{d\langle\psi_A|\psi_A\rangle}{dt}\,.$$

The right-hand side of this equation can be directly
calculated from

$$\frac{d\langle\psi_A|\psi_A\rangle}{dt} = \frac{d\langle\psi_A|}{dt}\,|\psi_A\rangle + \langle\psi_A|\,\frac{d|\psi_A\rangle}{dt}\,.$$
After substituting the right-hand side of equation
(II.1.10) and its mirror we arrive at the following
expression for the current:

$$J = \frac{-iew}{\hbar}\left(\langle\psi_A|\psi_B\rangle - \langle\psi_B|\psi_A\rangle\right) = \frac{2ewn}{\hbar}\,\sin(\theta_B - \theta_A)\,.$$

Defining the phase difference $\Theta = \theta_B - \theta_A$, we
obtain that

$$J = J_0 \sin\Theta \quad\text{with}\quad J_0 = 2ewn/\hbar\,.$$

This is a stunning result! Apparently there is a
DC current flowing through the junction without
any potential difference; the current is basically
driven by the phase difference between the two
superconducting slabs!
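
As a small sanity check on the last step, the sketch below evaluates the overlap expression for the current and compares it with $J_0 \sin\Theta$. It assumes the two slab states are pure phases times $|0\rangle$ with $\langle 0|0\rangle = n$, and it sets $e = w = n = \hbar = 1$ by default; these unit choices and the function name are illustrative assumptions.

```python
import numpy as np

def josephson_current(theta_a, theta_b, e=1.0, w=1.0, n=1.0, hbar=1.0):
    """Current from J = (-i e w / hbar) (<psi_A|psi_B> - <psi_B|psi_A>)."""
    overlap_ab = n * np.exp(1j * (theta_b - theta_a))  # <psi_A|psi_B>
    overlap_ba = np.conj(overlap_ab)                   # <psi_B|psi_A>
    current = (-1j * e * w / hbar) * (overlap_ab - overlap_ba)
    return current.real  # the imaginary part vanishes identically

# Scan the phase difference Theta = theta_B - theta_A over one period.
theta = np.linspace(-np.pi, np.pi, 9)
j_overlap = josephson_current(0.0, theta)
j_sine = 2.0 * np.sin(theta)  # J_0 = 2 e w n / hbar = 2 in these units
```

The two arrays agree, which is just the identity $-i(e^{i\Theta} - e^{-i\Theta}) = 2\sin\Theta$ at work.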


An AC current. There is another important equation,
which follows if we now in addition apply a voltage
across the barrier. Then $V_A \neq V_B$, and we can just
solve for the phase difference $\Theta$ to obtain

$$\frac{d\Theta}{dt} = \frac{2e}{\hbar}\,V\,, \tag{II.1.12}$$

where $V$ equals the potential difference
$V = V_B - V_A$. So we see that if we apply a
voltage over the junction, the current becomes an
AC current. This Josephson junction is a quantum
device that has the remarkable feature that the
frequency of the current measures the voltage!
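
To get a feeling for the numbers, here is a minimal sketch that converts a voltage into the oscillation frequency of the current. It assumes the phase evolves as $d\Theta/dt = 2eV/\hbar$, so that the frequency is $f = 2eV/h$, and it uses the exact SI values of $e$ and $h$; the function name is hypothetical.

```python
# Exact SI defining constants (2019 redefinition).
E_CHARGE = 1.602176634e-19  # elementary charge in coulomb
PLANCK_H = 6.62607015e-34   # Planck constant in joule second

def josephson_frequency(voltage):
    """Frequency f = (2e/h) V in hertz, assuming dTheta/dt = 2 e V / hbar."""
    return 2.0 * E_CHARGE * voltage / PLANCK_H

# A tiny voltage of one microvolt already gives a frequency of several hundred megahertz.
f_microvolt = josephson_frequency(1e-6)
```

Because frequencies can be measured very precisely, this conversion factor is what makes the junction useful as a voltage standard.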


The power delivered to the junction. Now the
amount of energy is the power delivered to the junction,
integrated over time, where the power is equal to the
product of the current and the applied voltage, $JV$. We
can write this in terms of our fundamental angular
variable:

$$U(t) = \int_0^{t} J V\,dt' = \frac{\hbar J_0}{2e} \int_0^{\Theta(t)} \sin\Theta'\,d\Theta' \quad\Rightarrow\quad U(\Theta) = \frac{\hbar J_0}{2e}\,(1 - \cos\Theta)\,. \tag{II.1.13}$$


We find that this energy is periodic in the phase 
difference, which is not so surprising if you realize 
that the whole setup is periodic from the start. 
Yet, to get to a more complete understanding we 
should take another contribution to the energy into 
account. 


The charging energy. You can think of this
junction as a (super)capacitor, with two
(super)conducting plates and an insulator in between. We
have an AC current $J(t)$ going through, so that a
charge $Q(t)$ and a related voltage $V(t)$ will build
up on the capacitor. The defining relation for the
capacitance $C$ of the capacitor is $Q = CV$, and $C$ is a
property of the junction which does not depend on
time.

There is a charging energy $U_Q$ that builds up in the
capacitor, which is given by the time integral,

$$U_Q = \int V\,dQ = \frac{Q^2}{2C}\,. \tag{II.1.14}$$
os| vagsz 014 
A mechanical analogue. Think of the total energy
function as a Hamiltonian

$$H(Q,\Theta) = \frac{1}{2C}\,Q^2 + \frac{\hbar J_0}{2e}\,(1 - \cos\Theta)\,,$$

with of course also the relation

$$Q = CV = \frac{\hbar C}{2e}\,\frac{d\Theta}{dt}\,.$$

This reminds us of a simple particle Hamiltonian
where the first term is like the kinetic energy,
proportional to the momentum squared (the velocity
being $d\Theta/dt$), and the second is like a potential
energy. It describes a particle running around on
the unit circle with an (angular) momentum $Q$
proportional to the angular velocity $d\Theta/dt$ in a nice
periodic potential $U(\Theta)$. This particle has a mass
proportional to $C$ and the strength of the potential is
proportional to $J_0$. One can now check with the
material we discussed in Chapter I.1, with $p = -Q$ and
$q = 2e\Theta/\hbar$, that (i) the dynamical equations
correspond with the equations we derived for $J$ and $V$,
and (ii) that the total energy is indeed conserved for
this mechanical system.

So this, in essence, basic quantum system, could 
in the end be mapped to a familiar classical system, 
where one can effectively apply one’s good old 
Newtonian mechanics skills and intuitions. 
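
The pendulum-like character of this system is easy to see numerically. The sketch below is only an illustration under stated assumptions: it works in units where $\hbar/2e = 1$, takes $C = J_0 = 1$, and assumes the sign convention $dQ/dt = -J_0\sin\Theta$ together with $d\Theta/dt = Q/C$, which is the choice for which $H = Q^2/2C + J_0(1-\cos\Theta)$ is conserved. The function names are hypothetical.

```python
import math

def energy(q, theta, c=1.0, j0=1.0):
    """H(Q, Theta) = Q^2/(2C) + J0 (1 - cos Theta), in units where hbar/2e = 1."""
    return q * q / (2.0 * c) + j0 * (1.0 - math.cos(theta))

def evolve(q, theta, dt=1e-3, steps=20000, c=1.0, j0=1.0):
    """Leapfrog integration of dTheta/dt = Q/C and dQ/dt = -J0 sin(Theta) (assumed signs)."""
    start = energy(q, theta, c, j0)
    max_drift = 0.0
    for _ in range(steps):
        q -= 0.5 * dt * j0 * math.sin(theta)  # half step for the 'momentum' Q
        theta += dt * q / c                   # full step for the 'position' Theta
        q -= 0.5 * dt * j0 * math.sin(theta)  # second half step for Q
        max_drift = max(max_drift, abs(energy(q, theta, c, j0) - start))
    return q, theta, max_drift

# Start at rest with a phase difference of one radian and let the system swing.
q_end, theta_end, max_drift = evolve(q=0.0, theta=1.0)
```

The phase difference swings back and forth like a pendulum, and the total energy stays constant up to the small discretization error of the integrator.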


This closes our Josephson-junction detour. Now
you can appreciate the remarks we made in
Chapter I.2 on units: equation (II.1.12) displays a
direct relation between a frequency and the ratio
of two universal constants, which is by the way the
fundamental unit of magnetic flux, $\Phi_0 = h/2e$. You
can use this relation to measure that ratio, but also
the other way around: knowing that ratio you can
measure voltages extremely accurately. Indeed,
this Josephson junction has many generalizations,
called Josephson effects, with ample applications.


This answers the so-called ‘silly’ question we 
started off with. The answer is that by introducing 
the interaction between two ‘trivial’ systems they 
become one, and there is only one unobservable 
phase left, while the other, relative, phase becomes 
a dynamical variable and acquires physical mean- 
ing of the utmost importance. 


This intermezzo illustrates in my opinion something 
interesting about doing physics: it is not always a 
matter of taking as much as possible into account, 
but rather it is after stripping the problem back to its
minimal form that essential insights are obtained.
In other words, my advice to the alert
reader would be: keep pestering your teachers 
with silly questions, because as you see, they may 
not be so silly after all and may lead to stunning 
answers! 


By the way, it was the Welshman Brian Josephson
who won the physics Nobel prize in 1973 at
33 years of age for the discovery of what is now
called the Josephson junction, which is in essence
the system we just described. He did the work in
Cambridge as a student at the age of 22. In other
words, we are never too old to learn and never too
young to make a difference! We will come back
to these matters in more detail in Chapter III.3
on condensed matter physics in Volume III of the
book.
Figure II.1.10: Qubit realizations. Four possible qubit
realizations: (i) an atom or particle that carries spin one-half like the
electron, (ii) the photon, (iii) particles trapped in optical lattices
having two well-separated levels, and (iv) spins in quantum dots.


Qubit realizations 


Any well-defined two level quantum system can be thought 
of as representing a qubit. This could also mean we re- 
strict ourselves to a subset of two specific states of a more 
elaborate quantum system. Examples of two state quan- 
tum systems are: 

(i) a particle that carries half a unit of spin like the elec- 
tron, the proton or neutron. These possess two basic spin 
states. If we measure its spin along any direction, we al- 
ways find either spin ‘up’ or ‘down’. This spin-1/2 property 
basically has no classical analog; we have introduced its 
discovery and its meaning on page 161 of Volume I. 

(ii) a photon with a fixed frequency, which possesses two
basic polarization states. The photon can oscillate in any
direction perpendicular to its direction of motion, and as
the photon necessarily moves with the velocity of light and
just cannot be put to rest, this frame is always well
defined. The polarization state can always be decomposed
into two perpendicular basis states, say ‘horizontal’ and
‘vertical’. We can arbitrarily designate one quantum state
as ‘spin up’, represented by the symbol $|+1\rangle$, and the
other ‘spin down’, represented by the symbol $|-1\rangle$. We
illustrated some typical polarization states of a photon in
Figure II.1.11. If both components are in phase with each
other, we say that the photon is linearly polarized. If they
are out of phase we speak of circularly or elliptically
polarized light, where we distinguish ‘left-handed’ and
‘right-handed’ polarization.

A photon is a qubit that necessarily travels with the speed
of light. If we generate an electromagnetic wave, what we
really do is make a beam of photons, and depending on
the type of source, this beam may be polarized or unpolarized.
But if we make an ultra-short light pulse, it is possible
to produce only a single photon.

(iii) A particle (say atom or molecule) in one of two lowest 
energy states which are well separated from the rest of the 
spectrum of states. A well-known example is the trapping 
of ions in an optical lattice. 

(iv) In quantum dots it is possible to individually manipulate 
spin carrying degrees of freedom such as polarized elec- 
trons, and therefore these can in principle be assembled 
into quantum information processing devices. 


Entanglement 


It is in multi-particle and multi-qubit states that some of the 
most counter-intuitive and powerful aspects of quantum 
theory surface: in particular the notion of entanglement. 
In Figure II.1.12 we give a ‘state of the union’, a schematic 
overview of the multi-qubit type of states and how they are 
related. This schematic summarizes the content of this 
section. 
Panel labels: $|v\rangle = |{+1}\rangle$, $|h\rangle = |{-1}\rangle$;
$|+\rangle = |{+1}\rangle + |{-1}\rangle$, $|-\rangle = |{+1}\rangle - |{-1}\rangle$;
$|r\rangle = |{+1}\rangle + i|{-1}\rangle$, $|l\rangle = |{+1}\rangle - i|{-1}\rangle$.

Figure II.1.11: Photon polarizations. Polarization states of the
photon decomposed into the standard basis vectors $|{+1}\rangle$ and
$|{-1}\rangle$. The top four are linearly polarized states, while the
bottom two are circularly polarized. The polarizations in the three
lines correspond to the eigenstates of the basic qubit observables,
$Z$, $X$, and $Y$, which will be defined after equation (II.2.2).


Multi-qubit states 


A quantum computer needs systems of multiple qubits,
called quantum registers. You may think of an array or
network of $n$ particles, each with its own spin. (As stated
before, the formalism does not depend on the precise
implementation, and it is possible to have examples in which the
individual qubits correspond to degrees of freedom other
than spin.) The quantessence doesn’t talk about how the
qubits are realized, but about their underlying structural
properties. The mathematical space in which the $n$ qubits
live is the tensor product of the individual qubit spaces,
which we write as
$\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2 = \mathbb{C}^{2^n}$.
For example, the Hilbert space for two qubits is
$\mathbb{C}^2 \otimes \mathbb{C}^2$. This is a four-dimensional complex
vector space spanned by the vectors
$|1\rangle \otimes |1\rangle$, $|{-1}\rangle \otimes |1\rangle$,
$|1\rangle \otimes |{-1}\rangle$, and $|{-1}\rangle \otimes |{-1}\rangle$.
So tensor products are not about multiplying numbers or
functions, but about multiplying spaces, where the product
refers to the dimensions: the product of an $m$-dimensional
and an $n$-dimensional space gives an $(m \times n)$-dimensional
space. So multi-qubit states live in an exponentially larger
state space ($d = 2^n$). For convenience we will often
abbreviate the tensor product by omitting the tensor product
symbols, or by simply listing the spins. For example

$$|1\rangle \otimes |{-1}\rangle = |1\rangle|{-1}\rangle = |1,{-1}\rangle\,.$$


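The tensor product of two qubit states with state vectors
$|\psi\rangle = \alpha|1\rangle + \beta|{-1}\rangle$ and
$|\phi\rangle = \gamma|1\rangle + \delta|{-1}\rangle$ is the state

$$|\psi\rangle \otimes |\phi\rangle = |\psi\rangle|\phi\rangle = \alpha\gamma\,|1,1\rangle + \alpha\delta\,|1,{-1}\rangle + \beta\gamma\,|{-1},1\rangle + \beta\delta\,|{-1},{-1}\rangle\,.$$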
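
The component formula for the tensor product is exactly what NumPy's Kronecker product `np.kron` computes. The sketch below assumes the column-vector representation (1, 0) for $|1\rangle$ and (0, 1) for $|-1\rangle$; the numerical amplitudes and the helper `register_dimension` are illustrative assumptions.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)    # |1>
down = np.array([0, 1], dtype=complex)  # |-1>

# Two illustrative single-qubit states.
alpha, beta = 0.6, 0.8j
gamma, delta = 1 / np.sqrt(2), -1 / np.sqrt(2)
psi = alpha * up + beta * down
phi = gamma * up + delta * down

# Components come out in the order |1,1>, |1,-1>, |-1,1>, |-1,-1>.
product = np.kron(psi, phi)
expected = np.array([alpha * gamma, alpha * delta, beta * gamma, beta * delta])

def register_dimension(n):
    """Dimension of the state space of an n-qubit register."""
    return 2 ** n

# A three-qubit basis state such as |1,1,-1> is a vector with 2^3 = 8 components.
three_qubit = np.kron(np.kron(up, up), down)
```

Note that the product of two unit vectors is again a unit vector, so the product state needs no extra normalization.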


A basic feature of the tensor product is that it is distributive, i.e.
$(\gamma|1\rangle + \delta|{-1}\rangle) \otimes |\psi\rangle = \gamma|1\rangle \otimes |\psi\rangle + \delta|{-1}\rangle \otimes |\psi\rangle$.
We emphasize once more that whereas the classical $n$-bit
system has $2^n$ states, the $n$-qubit system corresponds to
a vector of unit length in a $2^n$-dimensional complex space.
This is a continuous space, in fact a complex hypersphere. For
example a three-qubit state can be expanded as:

$$|\psi\rangle = \alpha_1|1,1,1\rangle + \alpha_2|1,1,{-1}\rangle + \alpha_3|1,{-1},1\rangle + \alpha_4|{-1},1,1\rangle$$
$$\qquad + \alpha_5|1,{-1},{-1}\rangle + \alpha_6|{-1},1,{-1}\rangle + \alpha_7|{-1},{-1},1\rangle + \alpha_8|{-1},{-1},{-1}\rangle\,.$$

As before it is convenient to denote the state vector by the
column vector of its complex components $\alpha_1, \alpha_2, \ldots, \alpha_{2^n}$.


When dealing with multi-qubit states, we have to make
clear distinctions between various types of states. These
are important in discussions to come later, yet I want to
present them here all at once, without elaborating too much
on their specific roles yet. It is nice to compare them and
contrast them with each other. First of all there are the
so-called pure states, and those are the states we have been
talking about so far. The pure multi-qubit or multi-particle
states break up into two types, the separable and the
entangled states. The notion of entanglement and its
dramatic physical implications are the subject of Chapter II.4,
about the Einstein–Podolsky–Rosen paradox and quantum
teleportation.

Figure II.1.12: Multi-qubit states. An overview of the types of
multi-qubit states and the relations between them. Labels in the
diagram: separable states, entangled states, multi-qubit states,
mixed states, reversible gates (CNOT), dephasing (decoherence),
partial tracing (decoherence).


If we talk about realistic quantum systems that couple to 
some environment or ‘classical’ measurement device, we 
often have to deal with states that are not pure but mixed 
states. To deal with both pure and mixed states it is con- 
venient to introduce the density operator, which provides a 
unified framework for all types of states. This concept was 
introduced by Von Neumann in the early days of quantum 
theory as an alternative to the wavefunction or state vector
approach. These are the topics that I will focus on in
the remainder of this section.


Entangled states 


When two systems, of which we know the states by
their respective representatives, enter into temporary
physical interaction due to known forces between
them, and when after a time of mutual influence
the systems separate again, they can no
longer be described in the same way as before,
viz. by endowing each of them with a representative
of its own.
I would not call that one but rather the characteristic
trait of quantum mechanics, the one that enforces
its entire departure from classical lines of
thought. By the interaction the two representatives
[the quantum states] have become entangled.

E. Schrödinger, 1935


Entanglement is a direct consequence of the linear su- 
perposition principle applied to multi-qubit or multi-particle 
states. If qubits are entangled this means that successive 
measurement outcomes on the two qubits will be highly 
correlated, implying that quantum theory is fundamentally 
non-local. 


The quantum states of systems consisting of spatially
separated components (e.g. two particles) can be entangled,
which implies that they can no longer be treated independently
and therefore measurements made on one can have
instantaneous consequences for the other! This is indeed a
quantessential feature of reality that dramatically departs
from the classical description of such a system. It is this
‘entanglement’ property that lies at the root of a zoo of
so-called quantum paradoxes, such as Schrödinger’s cat and
the EPR paradox, and more generally the quantum
measurement problem. But it is also essential for understanding
the Bell inequalities, which pose a rigorous quantitative
bound on classically allowed correlations; bounds that
have been observed to be violated in quantum systems.
Entanglement furthermore plays an essential role in
fashionable and promising subjects like quantum teleportation.
We return to these topics in Chapter II.4. In this section we
will merely touch on some of these aspects.
Figure II.1.13: Bohr and Einstein in one of their debates.
(Source: Photo made in 1930 by Paul Ehrenfest, courtesy AIP
Emilio Segrè Visual Archives.)


Schrödinger’s cat


When we have more than one qubit an important practi- 
cal question is when and how measurements of a given 
qubit depend on measurements of other qubits. Because 
of the deep properties of quantum mechanics, qubits can 
be coupled in subtle ways that produce consequences for 
measurement that crucially differ from classical bits. Un- 
derstanding this has proved to be important for questions 
relating to quantum computation and information transmis- 
sion. To explain this we need to introduce the opposing 
concepts of separability and entanglement, which describe 
whether measurements on different qubits are statistically 
independent or statistically dependent. 


This notion of entanglement as a necessary consequence
of the quantum postulates led to the infamous problem of
Schrödinger’s cat. This problem was well described by
Schrödinger himself:


[...] Man kann auch ganz burleske Fälle konstruieren.
Eine Katze wird in eine Stahlkammer gesperrt,
zusammen mit folgender Höllenmaschine (die
man gegen den direkten Zugriff der Katze sichern
muss): in einem Geigerschen Zählrohr befindet sich
eine winzige Menge radioaktiver Substanz, so wenig,
dass im Laufe einer Stunde vielleicht eines von
den Atomen zerfällt, ebenso wahrscheinlich aber
auch keines; geschieht es, so spricht das Zählrohr
an und betätigt über ein Relais ein Hämmerchen,
das ein Kölbchen mit Blausäure zertrümmert. [...]³

and⁴

[...] Hat man dieses ganze System eine Stunde
lang sich selbst überlassen, so wird man sich sagen,
dass die Katze noch lebt, wenn inzwischen
kein Atom zerfallen ist. Der erste Atomzerfall würde
sie vergiftet haben. Die Psi-Funktion des ganzen
Systems würde das so zum Ausdruck bringen,
dass in ihr die lebende und die tote Katze
(s. v. v.) zu gleichen Teilen gemischt oder verschmiert
sind. Das Typische an solchen Fällen
ist, dass eine ursprünglich auf den Atombereich
beschränkte Unbestimmtheit sich in grobsinnliche
Unbestimmtheit umsetzt, die sich dann durch direkte
Beobachtung entscheiden lässt. Das hindert
uns, in so naiver Weise ein “verwaschenes Modell”
als Abbild der Wirklichkeit gelten zu lassen...

³ It is also possible to construct quite burlesque cases. A cat is locked
into a steel chamber, together with the following infernal contraption
(which must be secured against direct access by the cat): a Geiger
counting tube contains a minute amount of radioactive substance, so
little that in the course of an hour perhaps one of the atoms decays,
but equally probably none; if it happens, then the counting tube
responds and, via a relay, releases a little hammer that shatters a
small flask of hydrocyanic acid.

⁴ [...] After one has left this whole system to itself for an hour, one
will say that the cat is still alive if no atom has decayed in the
meantime, as the first atomic decay would have poisoned it. The
wavefunction of the whole system would express this by having in it the
living and the dead cat mixed or smeared out in equal parts. It is
typical of such cases that an indeterminacy originally confined to the
atomic realm translates into macroscopic indeterminacy, which can then
be resolved by direct observation. This prevents us from accepting, in
such a naive way, a ‘washed-out model’ as an image of reality ...

Figure II.1.14: Schrödinger’s cat state. Artist impression of a
quantum cat in the state $|\psi_{\text{cat}}\rangle = |\text{alive}\rangle + |\text{dead}\rangle$.
(Source: JSTOR Daily.)


A cat in our classical world can either be dead or alive, 
and taking this quantum assumption to its logical extreme, 
this cat could in principle be in a state that is a linear
superposition of ‘alive’ and ‘dead’. This property of quantum
mechanics is simple to spell out but is radically different
from the way we talk about physical states in classical
physics. This difference derives directly from the quantes- 
sential principle that allows us to consider linear superpo- 
sitions of states, which therefore seems problematic from 
the start. 


The cat sits in a closed box with some food but also with
a lethal contraption consisting of a small quantity of a
radioactive substance, or a single metastable atom for that
matter. If that atom decays, it emits a photon that hits a
detector, which subsequently triggers a device which breaks
a little capsule filled with a poisonous gas that in turn will
kill the cat. This unfortunate scenario suggests that in
this situation the states of the atom labeled $|\text{decayed}\rangle$ or
$|\text{not decayed}\rangle$ are entangled with the states $|\text{alive}\rangle$ or
$|\text{dead}\rangle$, and we write:

$$|\Psi_{\text{cat}}\rangle = |\text{not decayed}\rangle \otimes |\text{alive}\rangle + |\text{decayed}\rangle \otimes |\text{dead}\rangle\,,$$

because the other states in the atom & cat state space
have zero coefficient, and we have assumed that both terms
are equally probable. What the formula above expresses is
that the undetermined state of the atom is entangled with
the states of the cat.


It seems a far-out proposal of a fundamental theory of
nature to take such states seriously. At the heart of this
problem lies the following question: if quantum mechanics is
the underlying reality of everyday life described by the laws
of classical physics, then it should be possible to understand
these classical laws from the quantum laws. There
may be no logic that leads you from classical to quantum
theory, but there should be a derivation of the laws of
classical physics starting from the quantum laws, because
classical physics is just an approximation of quantum physics,
and such approximations should be well understood.
We should compare this to how classical Newtonian
mechanics can be obtained as the limit of relativity where we
send the speed of light $c$ to infinity. The analogy suggests
that in quantum theory we just have to send Planck’s
constant $h$ to zero, and yes, in many cases this is what we
have to do, but such direct approaches do not resolve
issues like Schrödinger’s cat. The question has led to
numerous deep philosophical arguments among physicists
and philosophers right from the inception of quantum
mechanics in the beginning of the twentieth century, and we
will return to ‘Schrödinger’s cat’ in later chapters. For the
moment we just want to give a more accurate definition of
the different types of states for simpler systems.


Entangled vs separable states 


Let us now turn to precise definitions of multi-qubit states.
The $n$-qubit state is called separable if it can be written as
a single product of $n$ single-qubit states, i.e. if it can be
written as $n - 1$ tensor products of sums of qubits, with
each factor depending only on a single qubit. An example
of a separable two-qubit state is

$$|\psi\rangle = \tfrac{1}{2}\left(|1,1\rangle + |1,{-1}\rangle + |{-1},1\rangle + |{-1},{-1}\rangle\right),$$

because it can be written like

$$|\psi\rangle = \tfrac{1}{2}\left(|1\rangle + |{-1}\rangle\right) \otimes \left(|1\rangle + |{-1}\rangle\right) = |{+}{+}\rangle\,.$$


If an n-qubit state is separable, then measurements of in- 
dividual qubits are statistically independent, i.e. the proba- 
bility of outcomes of a series of measurements of different 
qubits can be written as a product of probabilities of the 
measurements for each qubit. These outcomes are un- 
correlated and the overall outcome is therefore indepen- 
dent of the order in which these measurements are per- 
formed. 
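
The statement about statistical independence can be illustrated with a few lines of code. The sketch below takes the components of a two-qubit state in the order $|1,1\rangle, |1,-1\rangle, |-1,1\rangle, |-1,-1\rangle$, builds the table of joint outcome probabilities, and checks whether it equals the product of the single-qubit probabilities. The function names are hypothetical, and the check only concerns spin measurements along the one axis that defines the basis, so it is a necessary condition for separability and not a sufficient one.

```python
import numpy as np

def joint_probabilities(state):
    """2x2 table p[a, b]: outcome a for the first qubit, outcome b for the second."""
    return np.abs(np.asarray(state, dtype=complex).reshape(2, 2)) ** 2

def outcomes_factorize(state, tol=1e-12):
    """True if the joint table is the product of its two marginal distributions."""
    p = joint_probabilities(state)
    first = p.sum(axis=1)   # probabilities for the first qubit alone
    second = p.sum(axis=0)  # probabilities for the second qubit alone
    return bool(np.allclose(p, np.outer(first, second), atol=tol))

# The separable example: equal weight on all four basis states.
separable = 0.5 * np.array([1, 1, 1, 1])
independent = outcomes_factorize(separable)
```

For the separable example every joint outcome has probability 1/4, which is just 1/2 times 1/2.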


If an $n$-qubit state is not separable, then it is by definition
entangled. An example of an entangled two-qubit state
is

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|1,1\rangle + |{-1},{-1}\rangle\right), \tag{II.1.15}$$

which indeed is a linear superposition that cannot be
factored into a single product. If we have a pair of qubits in
an entangled state, subsequent measurements of the
individual qubits do depend on each other. If you first make
a measurement on the first bit, then that measurement will
instantaneously affect the two-bit state and possibly the
state of the other bit, even if that is spatially arbitrarily far
away. The measurement thereby influences a later
measurement outcome of the second bit. Now the use of that
word ‘instantaneous’ should make you feel uneasy in view
of the theory of relativity, and correctly so. Some great
physicists, like Einstein to mention one, felt the same
way and preceded you. This thought-provoking question
unleashed a deep, but also long-winded debate about the
foundations of quantum theory, already among its founders.

5 Strictly speaking this is only true for pure states, which we define in
the next section.

Figure II.1.15: Separated pair. The state vector for the pair is
a product of the individual state vectors:
$\tfrac{1}{2}(|1\rangle + |{-1}\rangle) \otimes (|1\rangle + |{-1}\rangle) = \tfrac{1}{2}(|1,1\rangle + |1,{-1}\rangle + |{-1},1\rangle + |{-1},{-1}\rangle)$.


Let me illustrate this point for the examples we gave above. Suppose we do an experiment in which we measure the spin of the first qubit and subsequently measure the spin of the second qubit. For both the separable and entangled examples, there is a 50% chance of observing either spin up or spin down on the first measurement. Suppose it gives spin up. For the separable state this transforms the state vector as follows:

$$\begin{aligned}
|\psi\rangle &= \tfrac{1}{\sqrt{2}}\left(|1\rangle + |-1\rangle\right) \otimes \tfrac{1}{\sqrt{2}}\left(|1\rangle + |-1\rangle\right) \\
\rightarrow\ |\psi'\rangle &= |1\rangle \otimes \tfrac{1}{\sqrt{2}}\left(|1\rangle + |-1\rangle\right) = \tfrac{1}{\sqrt{2}}\left(|1,1\rangle + |1,-1\rangle\right),
\end{aligned}$$

which means that the measurement projects the initial state $|\psi\rangle$ on the first line onto the state $|\psi'\rangle$ on the second line. One may verify that the probability, in analogy with equation (II.1.5), equals $|\langle\psi'|\psi\rangle|^2 = 1/2$, as it should. The same probability would have resulted for the spin down measurement.⁶

So only the $|1\rangle$ component of the first qubit survives after the measurement. If we now measure the spin of the second qubit in the state $|\psi'\rangle$, the probability of measuring spin up or spin down is still 50%. And as mentioned before, the previous measurement on the first qubit has no effect on the second measurement. As we have already noted, it is a generic property of separable states that subsequent measurement outcomes on individual spin states are independent, and the outcomes do not depend on the order in which we perform the measurements.

6 We will deal with the observables and measurements more extensively in the next chapter.

Figure II.1.16: Entangled pair. Artist impression of what an entangled state of a pair would look like. It gives at least a feeling for it, maybe more so than the formula below telling you exactly what it is.


Let us now consider a similar experiment on the entangled state of equation (II.1.15) and observe spin up in the first measurement. This changes the state vector to

$$|\psi\rangle = \tfrac{1}{\sqrt{2}}\left(|1,1\rangle + |-1,-1\rangle\right) \ \rightarrow\ |\psi'\rangle = |1,1\rangle . \tag{II.1.16}$$

(Note the ‘disappearance’ of the factor $1/\sqrt{2}$ due to the necessity that the projected state vector remains normalized.) If we now measure the spin of the second qubit, we are certain to observe spin up! Similarly, if we observe spin down in the first measurement, we will also see that in the second qubit with 100% certainty. For this entangled example the measurement outcomes are completely correlated: the outcome of the first completely determines the second, and the state is therefore called maximally entangled. As this also holds for entangled qubits which are light years apart, this instantaneous effect on the state implies a puzzling if not bizarre form of non-locality in the quantum world that at first sight appears to violate causality.
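The two measurement sequences discussed above can be mimicked with a few lines of Python. This is only an illustrative sketch: the function names and the encoding of the basis states as list indices are assumptions, and the measurement is modeled as a simple projection onto the observed outcome followed by renormalization.

```python
import math

# Basis order for two qubits: |1,1>, |1,-1>, |-1,1>, |-1,-1>
# (for each qubit, bit value 0 = spin up "1", bit value 1 = spin down "-1").
S = 1 / math.sqrt(2)
separable = [0.5, 0.5, 0.5, 0.5]   # |+> (x) |+>
entangled = [S, 0.0, 0.0, S]       # equation (II.1.15)

def measure_first(state, outcome):
    """Project the first qubit onto `outcome` (0 = up, 1 = down).

    Returns (probability of that outcome, normalized post-measurement state).
    """
    kept = [amp if (idx >> 1) == outcome else 0.0 for idx, amp in enumerate(state)]
    prob = sum(abs(a) ** 2 for a in kept)
    return prob, [a / math.sqrt(prob) for a in kept]

def prob_second_up(state):
    """Probability that the second qubit is found spin up."""
    return sum(abs(amp) ** 2 for idx, amp in enumerate(state) if (idx & 1) == 0)

for name, state in [("separable", separable), ("entangled", entangled)]:
    p_up, after = measure_first(state, 0)
    # columns: P(first up), P(second up) before, P(second up) after first gave up
    print(name, round(p_up, 3), round(prob_second_up(state), 3),
          round(prob_second_up(after), 3))
```

For the separable state the last two columns should agree, while for the entangled state the probability for the second qubit changes once the first outcome is known.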


Bertlmann’s socks. There has been a debate among
physicists like John Bell and others about what it is that 
sets quantum entanglement really apart. The conundrum 
goes by the name Bertlmann’s socks. Mr Bertlmann, a 
real-life early collaborator of Bell at CERN, happens to al- 
ways wear socks of a different color. So, Mr Bertlmann, 
whose socks have risen to eternal fame, constitutes a sys- 
tem which has the unusual property that if you get to see 
one of his socks to be ‘red’ for example, then instanta- 
neously you are able to conclude that the other sock has 
the property ‘not red’. So here is a form of non-locality. You 
measure one sock and are hundred percent sure about a 
property of the other sock which is elsewhere. So, the conditional probability, given sock #1 is red, for sock #2 to be ‘not red’, is one. Is that not a classical version of quantum
entanglement? It looks like it, the states of the socks are 
highly correlated indeed, and knowing the state of the first 
affects the probability distribution for the other. It doesn’t 
change the socks or their color: it just affects the prob- 
ability of measurement outcomes. And there is nothing 
unusual, absurd or stunning about that. It is very much 
true that the state of the socks is not affected. There is no 
signal exchanged between the socks, since they are in a
definite state which is there to stay. 


The quantessence of entanglement. The quantum catch is that there is one additional feature in the quantum framework that has no classical analogue and which sets the EPR paradox apart. In the qubit experiment there is an additional freedom in making the measurements: one is free to choose the frame or polarization of a measurement. In the example of the entangled state given in (II.1.16), we could have chosen to measure the first qubit not in the $(|1\rangle, |-1\rangle)$ frame but, for example, in the $(|+\rangle, |-\rangle)$ frame. Then, given that the outcome of that measurement is, for example, plus one, we know that the second qubit has to be in the $|+\rangle$ state. Keeping the measurement for the second qubit as before in the $(|1\rangle, |-1\rangle)$ frame, the probability to find the outcome to be plus or minus one is 50% for each. This dependence of the probability of the second outcome on the choice one makes for the first measurement is what makes the situation non-local, because now, dependent on the frame choice and the outcome plus one for the first measurement, the second qubit flips instantaneously to the $|1\rangle$ or $|+\rangle$ state. And this looks very much like an instantaneous action at a distance: the state of the second qubit is affected, and therefore causality should be violated. Is it?


The answer is: no! As we have already seen, and will explore more extensively in the following chapters, the quantum state is like a probability amplitude, which encodes a probability distribution for measurement outcomes. Multi-qubit states, separable or entangled, encode all possible correlations that may or may not exist between sequences of measurement outcomes. A closer look at the examples given above shows precisely that: unconditional probabilities turn into conditional probabilities, which are different indeed. And since in the quantum world there are basically only probabilities, the measurements of entangled states are easier to grasp if you think of ‘states’ as encoding probability distributions. We return to these questions in the section on the Einstein-Podolsky-Rosen paradox in chapter II.4.

Figure II.1.17: CNOT gate. The circuit diagram is basically a two-qubit interaction diagram, representing the action of the CNOT gate on the four possible two-qubit basis states. As pointed out in Figure II.1.5, the dot on the upper qubit denotes the control and the cross is the symbol for the conditional one-qubit NOT gate.
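The frame-change argument used above can be made explicit by rewriting the entangled state (II.1.15) in the $(|+\rangle, |-\rangle)$ frame. The following worked step is a sketch that assumes the usual convention $|\pm\rangle = \tfrac{1}{\sqrt{2}}(|1\rangle \pm |-1\rangle)$; the sign convention for $|-\rangle$ is an assumption here.

```latex
\begin{aligned}
|1\rangle &= \tfrac{1}{\sqrt{2}}\left(|+\rangle + |-\rangle\right), \qquad
|-1\rangle = \tfrac{1}{\sqrt{2}}\left(|+\rangle - |-\rangle\right), \\
|\psi\rangle &= \tfrac{1}{\sqrt{2}}\left(|1,1\rangle + |-1,-1\rangle\right) \\
&= \tfrac{1}{2\sqrt{2}}\left[\left(|+\rangle + |-\rangle\right)\otimes\left(|+\rangle + |-\rangle\right)
 + \left(|+\rangle - |-\rangle\right)\otimes\left(|+\rangle - |-\rangle\right)\right] \\
&= \tfrac{1}{\sqrt{2}}\left(|+,+\rangle + |-,-\rangle\right).
\end{aligned}
```

The cross terms cancel, so finding the first qubit in $|+\rangle$ leaves the second qubit in $|+\rangle = \tfrac{1}{\sqrt{2}}(|1\rangle + |-1\rangle)$, which gives the 50% probabilities for a subsequent measurement in the $(|1\rangle, |-1\rangle)$ frame.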


From separable to entangled and back 


For two qubits in a separable state to get entangled they need to interact somehow. In quantum information language that would mean that they have to be acted upon by some two-qubit gate. Let us take our favorite CNOT gate of Figure II.1.5; it acts on the state $|A\rangle \otimes |B\rangle$ as follows:

$$\mathrm{CNOT}: |A\rangle \otimes |B\rangle \ \rightarrow\ |A\rangle \otimes |[-AB]\rangle .$$

In other words, the CNOT gate flips the state of B if A = 1, and does nothing if A = −1.

For convenience we give the explicit action on the basis states in Figure II.1.17, which allows you to verify that if we let the CNOT gate act on the separable state $|+\rangle \otimes |-1\rangle$ it indeed generates the maximally entangled state (II.1.15):

$$\mathrm{CNOT}: |+\rangle \otimes |-1\rangle \ \rightarrow\ \tfrac{1}{\sqrt{2}}\left(|1,1\rangle + |-1,-1\rangle\right). \tag{II.1.17}$$

Note that this gate is reversible, as one can immediately see from the figure: $\mathrm{CNOT}^2 = 1$.
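The action of the CNOT gate on $|+\rangle \otimes |-1\rangle$ can be verified with a short Python sketch. The representation of states as lists of four amplitudes and the helper names `cnot` and `tensor` are assumptions made for this illustration; the gate rule itself (flip B when A = 1) is the one stated above.

```python
import math

# Basis order: |1,1>, |1,-1>, |-1,1>, |-1,-1>  (first label = control A, second = target B).
def cnot(state):
    """Flip the target B when the control A is spin up (A = 1); do nothing when A = -1."""
    a11, a1m, am1, amm = state
    return [a1m, a11, am1, amm]

def tensor(q1, q2):
    """Tensor product of two single-qubit states given as [amp_up, amp_down]."""
    return [x * y for x in q1 for y in q2]

S = 1 / math.sqrt(2)
plus = [S, S]        # |+> = (|1> + |-1>)/sqrt(2)
down = [0.0, 1.0]    # |-1>

start = tensor(plus, down)   # separable input |+> (x) |-1>
bell = cnot(start)           # amplitudes of (|1,1> + |-1,-1>)/sqrt(2)
back = cnot(bell)            # applying CNOT twice returns the input
print(bell, back == start)
```

Applying `cnot` a second time undoes the entanglement, which is the reversibility property noted in the text.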


In fact, from an intuitive point of view the ability to gener- 
ate substantial speed-ups using a quantum computer vs. 
a classical computer is related to the ability to operate on 
the high dimensional state space including the entangled 
states. To describe a separable n-qubit state with k bits of 
accuracy we only need to describe each of the individual 
qubits separately, which only requires of the order of nk 
bits. In contrast, to describe an n-qubit entangled state 
we need of the order of k bits for each dimension in the 
Hilbert space, implying that we need of the order of $2^n k$
bits. If we were to simulate the evolution of an entangled 
state on a classical computer we would have to process 
all these bits of information and the computation would be 
extremely slow. Quantum computation, in contrast, acts 
on all this information at once — a quantum computation 
acting on an entangled state is just as fast as one acting 
on a separable state. This is exactly the type of parallelism
at the intermediate stages of computing that we referred to 
before. Thus, if we can find situations where the evolution 
of an entangled state can be mapped into a hard mathe- 
matical problem, we can achieve spectacular speed-ups. 
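To get a feeling for the gap between the two estimates, the following small Python sketch tabulates the order-of-magnitude counts for a few system sizes. The function names and the value k = 16 are hypothetical choices for illustration; the counts are rough estimates, not exact storage requirements.

```python
def bits_separable(n, k):
    """Roughly n*k bits: k bits for each of the n single-qubit factors."""
    return n * k

def bits_general(n, k):
    """Roughly 2**n * k bits: k bits for each of the 2**n Hilbert space dimensions."""
    return (2 ** n) * k

k = 16  # hypothetical accuracy in bits
for n in (10, 50, 300):
    print(n, bits_separable(n, k), bits_general(n, k))
```

The second column grows linearly with the number of qubits, the third exponentially.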


Mixed versus pure states 


The states we have been dealing with so far were statis- 
tically pure, or more simply, pure states. In spite of the 
quantessential uncertainties in such states, the state vec- 
tor is the most complete knowledge about a quantum state 
that is available. In real life however it may prove very dif- 
ficult to prepare a system in a pure state. After all, quan- 
tum phenomena are not that easy to detect, which means 
that pure states apparently are not so common. Somehow 
a lot of the quantum stuff gets washed away in ordinary 
life, quantum does not hit the eye, so to speak. The rea- 
son is that quantum systems are permanently interacting 
with their environment, and it is only in situations where we 
take exceptional care to protect our quantum system from 
those influences, that we can observe pure quantum be- 
havior. This is not easy; it certainly is not the case in most 
situations which arise naturally, and that is precisely why 
we perceive the world around us as completely classical. 
Turning this reasoning around, you may ask: given that the underlying world is entirely quantum, why is there such a thing as the classical world, and how does it come about? How can we understand the laundering of all that quantum exotica? This is the basic question one has to face in a detailed treatment of quantum measurement, which has to account for how we can start out with a quantum process and end up reading dials and counting macroscopic signals like clicks, pulses and whatnot. It is here that we
have to introduce the concept of a mixed state, and con- 
trast it with a pure state whether entangled or not. And 
indeed it is often through the interaction with the environ- 
ment that states get ‘mixed up’, just like humans do. We 
have to deal with mixed up people all the time, and we 
learned to deal with that!

Let us be pedantic and illustrate the distinction between a pure and a proper mixed state with the experimental setup depicted in Figure II.1.18. An incoming beam is polarized and each of the particles is in the pure state $|+\rangle$, i.e. with X = +1. Now we send them through a Z polarizer in (b). What we find behind the polarizer is a beam of particles, but now 50% of the particles are in a pure state $|1\rangle$ (c) and the other 50% are in a pure state $|-1\rangle$ (d). Now we could combine the particles in a single beam (without letting them interfere); then each of the particles is still in a pure state, but the beam is now a statistical mixture of particles in $|\pm 1\rangle$ states. The beam represents a classical ensemble of particles that is in a mixed state. This is called a proper mixture, to be distinguished from the improper mixture to be discussed shortly. In the present situation there is a non-zero probability for the particle to be in either one of two pure states. So picking out one particle, there is a 50% chance it is in a pure up-state and a 50% chance it is in the down-state, but (and this is crucial) it is not in a state corresponding to a linear superposition of the up- and down-state; that would just be the ‘plus’ state $|+\rangle$!

Figure II.1.18: Mixed state. Graphical representation of how to prepare a beam of particles in a proper mixture and corresponding mixed state. (a) All incoming particles are in the pure state $|\psi\rangle = |+\rangle$. (b) The incoming beam goes through a Z polarizer. (c) Half of the particles in the outgoing beam are in the pure state $|\psi\rangle = |1\rangle$. (d) The other half of the particles in the outgoing beam are in the pure state $|\psi\rangle = |-1\rangle$. For the incoming particles in figure (a) the density matrix is given by equation (II.1.23); in the outgoing beam (c)+(d) the particles have the density matrix of equation (II.1.24).


This mixed state is a classical statistical mixture and not a 
quantum superposition. What makes this setting a bit con- 
fusing is that we now have two types of probability to keep 
track of, the quantum probabilities we have been talking 
about so far and in addition the probability distribution of 
the classical ensemble. 


To further clarify this notion, let me point out that a naive but wrong way is to write for the state of the particle something like $|\psi_{\rm mix}\rangle = \sum_i p_i |\psi_i\rangle$. This looks dangerously close to the usual expansion of a pure state in a certain basis, $|\psi\rangle = \sum_k \alpha_k |\chi_k\rangle$. But there the coefficients are probability amplitudes $\alpha_k$ leading to probabilities $p_k \sim |\alpha_k|^2$. To put it differently, with the troublesome trial notation I just proposed we would get that the expectation or average value⁷ of an observable A in a mixed state $|\psi_{\rm mix}\rangle$ would become $\langle\psi_{\rm mix}|A|\psi_{\rm mix}\rangle \sim \sum_{i,j} p_i p_j \ldots$, an expression that is proportional to probabilities squared, which makes no sense.

7 I apologize for getting ahead of myself, as observables and their expectation values are to be discussed in detail in the next chapter on page 285.


What we want is a weighted average over ordinary pure state expectation values:

$$\langle A \rangle = \sum_a p_a \langle\psi_a|A|\psi_a\rangle . \tag{II.1.18}$$

In this expansion the states $|\psi_a\rangle$ are some set of pure states. These don’t have to be orthogonal, so it could be that $|\psi_1\rangle = |1\rangle$ and $|\psi_2\rangle = |+\rangle$, for example. It is for this reason that once we admit both mixed states and pure states it is almost imperative to use the density matrix formalism, because it treats both types of states on an equal footing.


The density operator 


The famous mathematical physicist John von Neumann 
developed an alternative formalism for quantum mechan- 
ics in terms of what is called a density operator, which 
basically replaces the wavefunction, or state vector, right 
from the start. 


The density operator formulation, as we will see shortly, leads in a natural way to the definition of what is called the Von Neumann entropy for a quantum system.


Proper mixtures. Consider, as we just did, a mixed state in which there is a probability $p_a$ for the system to have wavefunction $\psi_a$, and an observable A with an expectation value (II.1.18). The density operator defined for a pure state is just the projection operator we introduced in (II.1.6) for that state:

$$\rho = P_i = |\psi_i\rangle\langle\psi_i| ,$$

and the density operator for a properly mixed state is quite naturally defined as:

$$\rho = \sum_a p_a |\psi_a\rangle\langle\psi_a| , \tag{II.1.19}$$

which reduces naturally to the pure state case above if $p_a = 1$ for a single value $a = i$ and $p_a = 0$ otherwise. To obtain the density matrix, we have to expand this operator in an orthonormal basis. We start with

$$\rho = \sum_{a,j,k} p_a\, \alpha^a_j\, \alpha^{a*}_k\, |\chi_j\rangle\langle\chi_k| ,$$

where the $\alpha^a_j$ are the expansion coefficients of the pure state $|\psi_a\rangle$ in the $\{|\chi_j\rangle\}$ basis. The density matrix is defined by the matrix elements of the density operator:

$$\rho_{mn} = \langle\chi_m|\rho|\chi_n\rangle = \sum_a p_a\, \alpha^a_m\, \alpha^{a*}_n . \tag{II.1.20}$$

With density matrices the notion of a trace is convenient. Recall that the trace of a matrix is the sum of the diagonal elements, so we have for example that

$$\mathrm{tr}\,\rho = \sum_m \rho_{mm} = \sum_a p_a \sum_m |\alpha^a_m|^2 = \sum_a p_a = 1 ,$$

because the sums over m equal one for each value of a. The expectation value (II.1.18) is compactly expressed as:

$$\langle A \rangle = \mathrm{tr}\,(A\rho) , \tag{II.1.21}$$

and because the trace $\mathrm{tr}\,(A\rho)$ is independent of the chosen basis this expression can be evaluated in any convenient basis, and so provides an easy way to compute expectation values in any state. Note that for a pure state $p_a = 1$ for one particular value $a = i^*$, and $p_a = 0$ for $a \neq i^*$. In this case the density matrix has rank one. This becomes clear if we write the matrix $\rho$ in a basis in which it is diagonal, because then there will only be one non-zero element. When there is more than a single non-zero value of $p_a$ it is a mixed state and the rank is larger than one.
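The equivalence of the weighted average (II.1.18) and the trace formula (II.1.21) can be checked numerically, also for a mixture of non-orthogonal states. The sketch below is an illustration only: the helper names, the representation of matrices as nested lists, and the weights 0.25 and 0.75 are assumptions, while the states $|1\rangle$ and $|+\rangle$ are the non-orthogonal pair mentioned earlier.

```python
import math

def outer(psi):
    """|psi><psi| as a 2x2 nested list."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def mix(weights, states):
    """rho = sum_a p_a |psi_a><psi_a|, as in equation (II.1.19)."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, psi in zip(weights, states):
        proj = outer(psi)
        for m in range(2):
            for n in range(2):
                rho[m][n] += p * proj[m][n]
    return rho

def trace_product(A, rho):
    """tr(A rho) for 2x2 matrices."""
    return sum(A[m][n] * rho[n][m] for m in range(2) for n in range(2))

def expectation(A, psi):
    """<psi|A|psi> for a pure state."""
    return sum(psi[m].conjugate() * A[m][n] * psi[n]
               for m in range(2) for n in range(2))

S = 1 / math.sqrt(2)
up, plus = [1 + 0j, 0j], [S + 0j, S + 0j]
Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]

weights, states = [0.25, 0.75], [up, plus]   # hypothetical non-orthogonal mixture
rho = mix(weights, states)
for name, A in [("Z", Z), ("X", X)]:
    weighted = sum(p * expectation(A, psi) for p, psi in zip(weights, states))
    print(name, trace_product(A, rho).real, weighted.real)
```

For each observable the two printed numbers should coincide, which is the content of (II.1.21).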


Finally note that if we choose the unit matrix as the trivial
observable we get the trace of p itself, which equals one 
by definition. This property will be used if we consider par- 
tial traces, which refer to the density matrix of subsystems, 
later on. The best way to think about the density matrix 
of a proper mixture is as a classical distribution over pure 
quantum states. It is an essential concept if we want to understand and describe quantum measurements in more detail, in particular if we are to include the measurement apparatus in the analysis and want to understand how we get from quantum to classical physics: from a pure quantum state to a macroscopic pointer on a dial.


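To get a better feeling for how a density matrix works, consider a few simple examples of single-qubit states. First look at a spin in a pure state with $|\psi\rangle = |1\rangle$. The density operator is the corresponding projection operator:

$$\rho = |1\rangle\langle 1| \ \Leftrightarrow\ \rho_{mn} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{II.1.22}$$

The expectation of the spin polarization operator along the z-axis becomes

$$\langle Z \rangle = \mathrm{tr}\,(Z\rho) = \mathrm{tr}\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\right] = \mathrm{tr}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = 1 ,$$

as expected. Likewise we could construct the density matrix corresponding to another pure state $|+\rangle$ as

$$\rho = |+\rangle\langle +| = \tfrac{1}{2}\left(|1\rangle + |-1\rangle\right)\left(\langle 1| + \langle -1|\right) \ \Leftrightarrow\ \rho_{mn} = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}. \tag{II.1.23}$$

Now the expectation value of Z is

$$\langle Z \rangle = \mathrm{tr}\,(Z\rho) = \tfrac{1}{2}\,\mathrm{tr}\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\right] = \tfrac{1}{2}\,\mathrm{tr}\begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix} = 0 ,$$

as it should. If, however, the system is in a mixed state with 50% of the population spin up and 50% spin down the density matrix becomes

$$\rho = \tfrac{1}{2}\left(|1\rangle\langle 1| + |-1\rangle\langle -1|\right) \ \Leftrightarrow\ \rho_{mn} = \tfrac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{II.1.24}$$

In this case the expectation of the spin along the z-axis, which is $\mathrm{tr}\,(Z\rho)$, is zero again as it should be, because the probability for a particle in the mixed state to contribute +1 is equal to the probability to contribute −1 to the expectation value. The mixed state represents a classical statistical ensemble of particles that are either in a definite quantum ‘up’ or a definite quantum ‘down’ state.


Quantum entropy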


The introduction of the density matrix allowed Von Neumann to extend the fundamental concept of entropy to the quantum domain. He defined the entropy of a quantum state in analogy with the Gibbs entropy for a classical ensemble as

$$S(\rho) = -\mathrm{tr}\,(\rho \log \rho) = -\sum_i p_i \log p_i , \tag{II.1.25}$$

where the right-hand side follows from the definition (II.1.19) when the states $|\psi_i\rangle$ are mutually orthogonal, so that the $p_i$ are the eigenvalues of $\rho$. The entropy of a quantum state provides a quantitative measure of ‘how mixed’ a system is, because the entropy of a pure state is equal to zero, whereas the entropy of a mixed state is always greater than zero. Let us check this with the examples of the previous subsection. For the pure state we first considered, (II.1.22), we have p(+1) = 1 and p(−1) = 0, which for the quantum entropy yields $S_{\rm pure} = -1 \log 1 - 0 \log 0 = 0$ (with the convention $0 \log 0 = 0$). This reflects that if you know the pure state a system is in, you know everything there is to know about it, and therefore there is no hidden information and the entropy should be zero, and happily that is true. For the properly mixed case of (II.1.24) we found p(+1) = p(−1) = 1/2, and we obtain $S_{\rm mixed} = -2 \cdot \tfrac{1}{2} \log \tfrac{1}{2} = \log 2$, which corresponds to the information of one bit. And it is here that we make contact with the definition by Shannon of information being proportional to the entropy as defined by Boltzmann and Gibbs. We see quite generally that a mixed state corresponding to an equal probability distribution over N orthogonal pure states, $\rho = \frac{1}{N}\sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i|$, has the maximal entropy S = log N, corresponding to the good old Boltzmann formula. All this underscores the remark that a (properly) mixed state is just a classical distribution over quantum states.
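The entropy values quoted above can be reproduced with a small Python sketch. It assumes a single qubit, so that the eigenvalues of the 2x2 density matrix can be written in closed form; the function names are illustrative, and base-2 logarithms are used so that the result is expressed in bits.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix given as a nested list."""
    a, d = rho[0][0].real, rho[1][1].real
    b = rho[0][1]
    half_gap = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    mean = (a + d) / 2
    return [mean + half_gap, mean - half_gap]

def von_neumann_entropy(rho, base=2):
    """S = -sum_i p_i log p_i over the eigenvalues p_i, using 0 log 0 = 0."""
    return sum(-p * math.log(p, base) for p in eigenvalues_2x2(rho) if p > 1e-12)

rho_up = [[1, 0], [0, 0]]              # pure state |1>, equation (II.1.22)
rho_plus = [[0.5, 0.5], [0.5, 0.5]]    # pure state |+>, equation (II.1.23)
rho_mixed = [[0.5, 0], [0, 0.5]]       # 50/50 mixture, equation (II.1.24)

for name, rho in [("up", rho_up), ("plus", rho_plus), ("mixed", rho_mixed)]:
    print(name, von_neumann_entropy(rho))
```

Both pure states should give zero entropy, even though $|+\rangle$ has non-zero off-diagonal elements, while the mixture gives one bit.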


Entanglement entropy 


We just saw how the Von Neumann entropy yields a quantitative measure of ‘how mixed’ a quantum state is. The entropy of a pure state (that may be entangled or not) is always equal to zero, whereas the entropy of a mixed state is always greater than zero. So why invent the term entanglement entropy if the entropy of an entangled state is always zero? The logic is somewhat oblique in that the term in fact refers to the entropy of a mixed state which is obtained after one traces out ‘part’ of the density matrix of an entangled state. For this reason such states are referred to as improper mixtures, in contrast to the proper mixtures which refer to the cases we discussed before where the state is a statistical mixture of pure states.


Partial traces and improper mixtures. In certain situations there is indeed a close relationship between entangled and mixed states, and that is what I would like to explain next. It entails a mechanism that plays a vital role in explaining the all-important fact that the world we perceive is classical rather than quantum, and this explanation involves the phenomenon of decoherence that we'll get into shortly. The crucial observation is that an entangled but pure state in some higher-dimensional multi-qubit space can appear to be a mixed state when looked at from the point of view of a lower-dimensional subspace. Such mixed states that may appear when restricting the density matrix to a subspace by (partially) tracing out the other part are referred to as improper mixtures, and these are clearly essentially quantum because they derive directly from a pure (though entangled) state of the system.


Take a situation where we only look at part of the system. It might be that we can only measure certain qubits and not others, without being aware of it. This is frequently the case because systems interact continuously with their environment. Studying the quantum behavior of a system requires extraordinary precautions to make sure that it does not engage in interactions that we have no control over. Such ‘unknown knowns’ might well wash away the quantum effects we were looking for. Quantum effects depend on the subtle phase relations that make quantum states in fact highly coherent. What to do if part of our system is out of sight? It boils down to a quantum, yet touching variant of the Romeo and Juliet story called ‘A qubit named Botzilla’.


A qubit named Botzilla. Once upon a time there were two qubits who had nothing to do with each other and therefore the two were in a separable state, say $|+\rangle_B \otimes |1\rangle_A$. We have a two-qubit system where qubit Botzilla is our object of study, while qubit Abigail is the girl out there who wants to get entangled (a form of quantum common-law marriage, so to speak) with our beloved Botzilla. Abigail, having studied equation (II.1.17), decided to bring in her charming friend CNOT to make it happen. If you lead us through the Gate, eternal gratitude will be yours! And this is what happened. Both of them, not so young, not-lovers really, went through the Gate anyway, and came out entangled indeed. As you may have anticipated, they did not live a long and happy life ever after. Abigail turned out to be a Botwoman and managed to one day disappear from the air, leaving Botzilla behind in a severely mixed-up state (basically making him the classical example of a quantum divorcee).

To understand the deplorable state Botzilla finds himself in, we have to perform what is called a partial trace in the quantum jargon. The point is that he can make only observations which concern himself, though, whether he wants it or not, he is still entangled with Botwoman Abigail. This means that he only has a small subset of observables at his disposal, of the type $B \otimes \mathbb{1}$. So calculating the expectation value of such an observable involves tracing over the Abigail qubit. This amounts to just establishing the fact that Abigail is ‘still there’ with unit probability, yet because the state is entangled, the effect of her ‘being somewhere’ is non-trivial. It leads to a result which can be described by saying that Botzilla is calculating the expectation value of just the operator B in his own system, but in a particular mixed state. So, he may ignore Abigail, but then has to pay the price of being in a mixed state.

Let us now ‘trace out’ Abigail and see what trace she left on Botzilla’s state. This is achieved by summing over all the states associated with the subspaces we want to ignore, or better, about which we know nothing in particular. This means that we have to add up the diagonal entries with Abigail indices. We know already that Botzilla and Abigail ended up in the entangled state of equation (II.1.15), which we have to trace with respect to the second (Abigail) qubit. This we do by making use of the fact that $\mathrm{tr}\,(|a\rangle\langle b|) = \langle b|a\rangle$. Using labels A and B to keep the qubits apart, and remembering that we are using orthogonal basis states, the calculation can be written as

$$\begin{aligned}
\rho_B &= \mathrm{tr}_A\left(|\psi_{BA}\rangle\langle\psi_{BA}|\right) \\
&= \tfrac{1}{2}\,\mathrm{tr}_A\left[\left(|1\rangle_B|1\rangle_A + |-1\rangle_B|-1\rangle_A\right)\left(\langle -1|_A\langle -1|_B + \langle 1|_A\langle 1|_B\right)\right] \\
&= \tfrac{1}{2}\left(|1\rangle_B\langle 1|_B\,\langle 1|1\rangle_A + |-1\rangle_B\langle -1|_B\,\langle -1|-1\rangle_A\right) \\
&= \tfrac{1}{2}\left(|1\rangle\langle 1| + |-1\rangle\langle -1|\right)_B \ \Leftrightarrow\ \tfrac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\end{aligned} \tag{II.1.26}$$

This is the density matrix for Botzilla in a mixed state with probability 1/2 to either be spinning up or spinning down. The corresponding entropy is also higher: in base two $S = -\log_2(1/2) = 1$ bit, while for the original pure state $S = -\log_2 1 = 0$. The whole operation of tracing out Abigail is non-unitary and irreversible, as we moved from two qubits to one. Indeed, exactly one bit of information got lost to the environment (it was taken along by Abigail). In fact we could of course also calculate the entropy for the state Abigail finds herself in; then we have to trace over Botzilla’s states. The situation is entirely symmetric, and her entropy will also be 1 bit. So there is some justice after all! This is what happens: the system Botzilla + Abigail is in a pure state all along and the total entropy therefore remains zero, but looking at subsystems this is no longer true. What remains true is that if we divide the system up into two complementary parts, the entropy in each of them increases equally.

Generally it is the case that if we begin with a statistically pure separable state and perform a partial trace we will still have a pure state, but if we begin with an entangled state and perform a partial trace, we will get a mixed state, as we just saw. In the former case the entropy remains zero, and in the latter case it increases. It is precisely in this sense that the Von Neumann entropy yields a useful measure of entanglement.

This observation is relevant for the understanding of real quantum systems, because most realistic quantum systems are strongly entangled with their environment. We don’t know exactly how and with what, but it means that we tacitly trace out all kinds of things we are not aware of. What we know is that these systems behave quite classically in the end, and that in fact we should not be too surprised about that because they are in a strongly mixed state.
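The partial trace in the Botzilla story can be carried out mechanically. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the first qubit is taken to be the one that is kept and the second the one that is traced out, and the purity $\mathrm{tr}\,\rho^2$ is used as a simple indicator of mixedness (it is not introduced in the text).

```python
import math

def partial_trace_second(state):
    """Reduced density matrix of the first qubit of a pure two-qubit state.

    `state` holds the amplitudes of |1,1>, |1,-1>, |-1,1>, |-1,-1>, read as
    c[b][a] with b the kept (first) qubit and a the traced-out (second) qubit:
    rho_B[m][n] = sum_a c[m][a] * conj(c[n][a]).
    """
    c = [state[0:2], state[2:4]]
    return [[sum(c[m][a] * complex(c[n][a]).conjugate() for a in range(2))
             for n in range(2)] for m in range(2)]

def purity(rho):
    """tr(rho^2): 1 for a pure state, 1/2 for a maximally mixed qubit."""
    return sum(rho[m][n] * rho[n][m] for m in range(2) for n in range(2)).real

S = 1 / math.sqrt(2)
entangled = [S, 0.0, 0.0, S]       # (|1,1> + |-1,-1>)/sqrt(2), equation (II.1.15)
product = [0.5, 0.5, 0.5, 0.5]     # |+> (x) |+>

rho_entangled = partial_trace_second(entangled)
rho_product = partial_trace_second(product)
print(rho_entangled, purity(rho_entangled))
print(rho_product, purity(rho_product))
```

Tracing out one qubit of the entangled pair leaves the other in the mixed state with equal weights on the diagonal, whereas the product state stays pure.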


Event horizons revisited. The Botzilla tale we have just worked through may have reminded you of the black hole information paradox, which we addressed in the section on black holes in Chapter I.3 on page 139. We know that the Hawking-Bekenstein analysis leads to a macroscopic black hole entropy and temperature of the horizon. And we discussed that this is a property that can be assigned in a frame of reference where an event horizon is perceived. Our discussion of quantum entropy clearly allows for a microscopic mechanism generating the entropy. We imagine the creation of a particle-antiparticle pair in a pure maximally entangled state, where one of the two particles falls through the horizon. This means that the Hilbert space factor corresponding to the lost particle gets traced out, which in turn tells us that the left-over particle finds itself in a mixed, maximal-entropy state, very much like in the Botzilla story. The entropy is the quantum entropy that arises because we are forced to take a partial trace. I have to admit that whether and how this perspective would fit into the ‘quantum gravity’ scenarios is still under serious debate.


Decoherence 


Decoherence is the effect that a quantum system in a pure 
state loses its quantum coherence due to interaction with a 
complicated environment. It is one of the reasons why the 
world around us obeys the laws of classical physics. 


Of course a quantum system may be in a pure state but if 
we do not take care it may quickly, through random inter- 
actions with the environment, end up in a mixed state. It is 
basically in a classical state where there are no quantum 
interference effects left and probabilities add, not quantum 
amplitudes. The quantum state ‘decoheres’. 


If we talk about qubit systems, then a way to think of these interactions is of course to think of gates that affect the state and thereby cause decoherence. For example we may have a qubit in the $|+\rangle$ state, and have it interact with some phase gates, like a photon going through a random sequence of phase plates. The action of the phase gate $P(\theta)$ corresponds to the unitary operator:


(8) > Ce) (os): 


Let us see what happens to the corresponding density ma- 
trix: 


Vay Al fe = 
= 5(; 1) 018) =P: po P= ( or 


Next we randomize the phases with some normal distri- 
bution as to represent ‘the environment’. This means that 
we choose the density of dephasing agents to be Gaus- 
sian 


f(0) = eA, (11.1.27) 


and then the effect of the random sequence of gates is 
obtained by averaging the above expression: 


2i 1/1 eM) 1/1 0 
=| ¢@)0(6)40 = = 
=| pele z (eva 1 Eeit J 


What this calculation shows is that only the classical prob-
abilities on the diagonal are left and the off-diagonal phase
coherence of the quantum state $\rho_0$ has disappeared. By
choosing $\Delta$ large enough we wash out all quantum corre-
lations and end up with a classical distribution over up and
down states. This calculation merely illustrates a mecha-
nism that leads to decoherence. Clearly, one would also like
to compute the time-scales over which this decoherence takes
place; these depend of course on the details of the
environment or measurement apparatus.
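
The averaging step can be checked numerically. The following is a minimal sketch, not the author's calculation: it assumes the phase gate acts as diag(1, e^{iθ}) on the $|+\rangle$ state and that the phases follow a zero-mean Gaussian of variance `var`, which plays the role of the width of the distribution. The function names `rho_theta`, `gaussian` and `averaged_rho` are made up for this illustration.

```python
import cmath
import math

def rho_theta(theta):
    """Density matrix of |+> after the assumed phase gate diag(1, e^{i*theta})."""
    return [[0.5, 0.5 * cmath.exp(-1j * theta)],
            [0.5 * cmath.exp(1j * theta), 0.5]]

def gaussian(theta, var):
    """Normalized Gaussian density with zero mean and variance var."""
    return math.exp(-theta * theta / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def averaged_rho(var, steps=20001):
    """Average rho(theta) over the Gaussian using a simple Riemann sum."""
    cutoff = 10.0 * math.sqrt(var)          # integrate over +/- 10 standard deviations
    d_theta = 2.0 * cutoff / (steps - 1)
    total = [[0j, 0j], [0j, 0j]]
    for k in range(steps):
        theta = -cutoff + k * d_theta
        weight = gaussian(theta, var) * d_theta
        rho = rho_theta(theta)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * rho[i][j]
    return total

narrow = averaged_rho(var=0.01)   # almost no dephasing: coherence survives
wide = averaged_rho(var=25.0)     # strong dephasing: off-diagonals are washed out
print(abs(narrow[0][1]), abs(wide[0][1]))
```

For a Gaussian of variance `var` the off-diagonal entry is suppressed by the factor exp(-var/2), while the diagonal entries stay at one half, which is the washing-out of coherence described above.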


Let us close this section with another toy model of decoher-
ence. We start with a separable two-qubit state which we
entangle using the CNOT gate as we did in (II.1.17). Then
we use the Botzilla-Abigail mechanism by taking the partial
trace with respect to Abigail as in (II.1.26), ending up with
the mixed state for Botzilla. This basically turns the story
into a decoherence phenomenon.


In other words, we imagine an interaction of a qubit B
in a state $|\psi_B\rangle$ with the environment (a qubit A in some
state $|\psi_A\rangle$) to generate an entangled two-qubit state
$|\psi_{BA}\rangle$ from a separable two-qubit state
$|\psi\rangle = |\psi_B\rangle \otimes |\psi_A\rangle$.
When viewed from the perspective of a single qubit, the
resulting state, after tracing out the A qubit, becomes inco-
herent. That is, suppose we look at (II.1.17) in the density
matrix representation. Looking at the first qubit only, the
state vector of the separable state is $|\psi_B\rangle = |+\rangle$, a pure
state in the density matrix representation given by equa-
tion (II.1.23),


\[ \rho_B = |+\rangle\langle +| = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} . \]


Under the action of CNOT this becomes the maximally en-
tangled state on the right-hand side of equation (II.1.17).
After partially tracing the density matrix as in (II.1.26) we
end up with the B qubit in a mixed state given by (II.1.24),


\[ \rho_B = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \]


Only the ‘classical’ probabilities on the diagonal are left
and the off-diagonal phase coherence of the quantum state
has disappeared due to entangling a degree of freedom in
the environment.
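
The second toy model can be traced through by hand or with a few lines of code. The sketch below is only an illustration under stated assumptions: the basis order $|1,1\rangle, |1,-1\rangle, |-1,1\rangle, |-1,-1\rangle$, the system qubit as CNOT control, the environment qubit starting in $|1\rangle$, and the helper names `apply`, `density` and `trace_out_second` are all choices made here, not taken from the text.

```python
import math

# Basis order (an assumption of this sketch): |1,1>, |1,-1>, |-1,1>, |-1,-1>,
# where the first label is the system qubit and the second the environment qubit.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def apply(matrix, vector):
    """Matrix times column vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(len(vector)))
            for i in range(len(matrix))]

def density(vector):
    """Density matrix |v><v| of a pure state."""
    return [[a * complex(b).conjugate() for b in vector] for a in vector]

def trace_out_second(rho4):
    """Partial trace over the second (environment) qubit of a 4x4 density matrix."""
    return [[sum(rho4[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
            for i in range(2)]

s = 1.0 / math.sqrt(2.0)
separable = [s, 0.0, s, 0.0]            # |+> (system) times |1> (environment)
entangled = apply(CNOT, separable)      # (|1,1> + |-1,-1>)/sqrt(2)

rho_before = trace_out_second(density(separable))   # pure |+><+|, all entries 1/2
rho_after = trace_out_second(density(entangled))    # diagonal, entries 1/2
print(rho_before)
print(rho_after)
```

Before the CNOT the reduced density matrix still has its off-diagonal entries; after the CNOT they vanish, although nothing was done to the system qubit other than entangling it with the environment.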

Table II.1.1: Key quantum principles concerning the Hilbert space of quantum states introduced in Chapter II.1.


(i) Hilbert space. The complex vector space of states of a quantum system, denoted by $\mathcal{H}$. In this chapter we restrict ourselves to the finite-dimensional case. We refer to the Math Excursion on complex vectors and matrices on page 632 of Volume III.

(ii) State vector. A pure quantum state is denoted by $|\psi\rangle$, corresponding to a ket or column vector in $\mathcal{H}$.

(iii) Expansion of state in a basis. Any state $|\psi\rangle$ has a linear expansion in the basis $\{|i\rangle\}$ given by $|\psi\rangle = \sum_i \alpha_i |i\rangle$, with normalization condition $\sum_i |\alpha_i|^2 = 1$.

(iv) Probabilistic interpretation. A measurement on the above state $|\psi\rangle$ of the ‘property’ related to a basis $\{|i\rangle\}$ gives outcome $i$ with probability $|\alpha_i|^2$.

(v) Qubit state. A qubit state is a two-dimensional complex vector: $|\psi\rangle = \alpha|1\rangle + \beta|-1\rangle$ with normalization $|\alpha|^2 + |\beta|^2 = 1$. A realization is a spin 1/2 degree of freedom where the vector is called a spinor.

(vi) Conjugate states. The complex conjugate or dual state of $|\psi\rangle$ is defined by the bra or row vector $\langle\psi| = \sum_i \alpha_i^* \langle i|$.

(vii) Bracket or inner product. For two states $|\psi\rangle$ and $|\phi\rangle$ with coefficients $\alpha_i$ and $\beta_i$ respectively, we define the inner product as the bracket $\langle\phi|\psi\rangle = \sum_i \beta_i^* \alpha_i$. The orthonormal frame satisfies $\langle i|j\rangle = \delta_{ij}$. $\langle\phi|\psi\rangle$ is a complex number that satisfies $\langle\phi|\psi\rangle = \langle\psi|\phi\rangle^*$.

(viii) Multi-particle or qubit states. If particle one has a state that is $m$-dimensional and particle two has a state that is $n$-dimensional, then the two-particle system has an $(m \times n)$-dimensional state vector, which can be expanded as $|\psi^{(12)}\rangle = \sum_{ij} \gamma_{ij}\, |i\rangle \otimes |j\rangle$. A two-qubit state vector is $2^2 = 4$-dimensional, written as: $|\psi\rangle = \alpha_1|1,1\rangle + \alpha_2|1,-1\rangle + \alpha_3|-1,1\rangle + \alpha_4|-1,-1\rangle$, with $|i,j\rangle = |i\rangle \otimes |j\rangle = |i\rangle|j\rangle$.

(ix) Entangled and separable states. An $n$-particle state is separable if the state can be factorized in an $n$-fold product: $|\psi^{(1\ldots n)}\rangle = |\psi^{(1)}\rangle|\psi^{(2)}\rangle \cdots |\psi^{(n)}\rangle$. A state is entangled if it is not separable.

(x) Mixed states. A mixed state is a properly normalized (statistical) mixture of some set $\{|\psi_a\rangle\}$ of pure states: $|\Psi) = \sum_a p_a |\psi_a\rangle$, with probability $p_a$ that the system is in the pure state $|\psi_a\rangle$.

(xi) Density matrix/operator. The density operator for a mixed state $|\Psi)$ is defined as $\rho = \sum_a p_a |\psi_a\rangle\langle\psi_a|$. For a pure state there is only one term, with $p = 1$.

(xii) Quantum entropy. The quantum entropy of a mixed state is given by: $S(\rho) = -\mathrm{tr}\,\rho\log\rho = -\sum_a p_a \log p_a$. For a pure state the entropy is zero.


Chapter II.2 


Observables, measurements and uncertainty 


It is wrong to think of that past [ascribed to a quan- 
tum phenomenon] as ‘already existing’ in all de- 
tail. The past is theory. The past has no existence 
except as it is recorded in the present. By decid- 
ing what questions our quantum registering equip- 
ment shall put in the present we have an undeni- 
able choice in what we have the right to say about 
the past. 
John Archibald Wheeler, 
Some Strangeness in Proportion (1980) 


In the previous chapter we focussed exclusively on states, 
in particular the space of pure quantum states, the Hilbert 
space H. In this chapter we consider the physical vari- 
ables or quantum observables. These are represented by 
linear operators or matrices which act on the Hilbert space. 
The fact that physical variables are no longer represented 
by ordinary numbers or functions like in classical physics, 
but by matrices or differential operators makes quantum 
theory fundamentally different. It leads to deep reflections 
on the logical structure of the theory, on the nature of mea- 
surements, and on the fundamental aspects of uncertainty 
so concisely expressed by the Heisenberg uncertainty re- 
lations. And that is what this chapter is about. It should 
make you feel at home in Hilbert space. 

We have summarized and specified the basic ingredients
of the mathematical framework and the jargon that comes
along with the notion of observables, which forms the sub-
ject of this chapter, in the table at the end of the chapter
on page 321.


Quantum observables are operators 


Physical variables or observables in quantum theory are 
represented by hermitian operators. In this section we ex- 
plore what this means in general and work out most of the 
details for the case of qubits. Operators have a spectrum 
of eigenvalues that correspond to possible measurement 
outcomes. To these eigenvalues correspond orthogonal 
eigenstates (or subspaces), which can be used to define a 
suitable frame for the Hilbert space. The aim of this sec- 
tion is to exhibit the algebraic structure of the theory, with 
the observables, projection operators and raising and low- 
ering operators which play essential roles in describing the 
generic properties of quantum systems. 


The algebra of observables. For a quantum system we
have a set of dynamical variables called observables,
$\mathcal{O} = \{A, B, \ldots\}$. In most cases these correspond to the
classical variables, but there may be additional variables such
as the aforementioned spin, which have no classical ana-
logue. In classical physics the language of states and
dynamical variables is smoothly connected, basically
because the states are labelled by the (real) values of the
dynamical variables. This however is no longer true in the
quantum world. In quantum theory we make a clear dis-
tinction between the Hilbert space $\mathcal{H}$ of states and the set
of observables $\mathcal{O}$. Let us start with some general proper-
ties and definitions.


1. Operators on Hilbert space. The quantum observables
are represented by linear operators that act on Hilbert
space.¹ In other words we have that $\mathcal{O} : \mathcal{H} \to \mathcal{H}$, and
we write:


\[ |\psi'\rangle = A\,|\psi\rangle , \]


with $|\psi'\rangle, |\psi\rangle \in \mathcal{H}$ and $A \in \mathcal{O}$.

You should typically think of matrices in case the Hilbert
space is finite dimensional.² In the infinite-dimensional
case, we should think of continuous systems like a particle,
where the states are described by wavefunctions $\psi(x, t)$,
and the operators are typically represented by a differential
operator, like the momentum and energy operators:


\[ P = -i\hbar\,\frac{d}{dx} \quad \text{and} \quad H = i\hbar\,\frac{d}{dt} , \]


as we mentioned in the previous chapter.

The fact that observables are operators that ‘act on states’ 
implies that they may well change the physical state, and 
strongly suggests the possibility that the act of measure- 
ment of such an observable will affect the state of the sys- 
tem. 


2. Linearity. Linearity implies that for any two states and
any observable $A$ we have that,


\[ A\left(|\psi_1\rangle + |\psi_2\rangle\right) = A|\psi_1\rangle + A|\psi_2\rangle . \]


¹In this book we adopt the convention to represent quantum observ-
ables with uppercase letters while for their values we use lowercase.
The set $\{a\}$ of allowed values is called the sample space of the observ-
able $A$.

²We refer to the Math Excursion on page 614 of Volume III for an
introduction to real matrices and vectors, which was extended to the
complex case in the Math Excursion on page 632.


3. Hermitian adjoint. On the algebra $\mathcal{O}$ we can define
a hermitian adjoint, or ‘dagger’ operation, denoted as $\dagger$,
where $A \to A^\dagger$. The definition is as follows


\[ \langle\phi|A^\dagger|\psi\rangle = \langle\psi|A|\phi\rangle^* \quad \text{for all } |\phi\rangle, |\psi\rangle \in \mathcal{H} . \tag{II.2.1} \]


Sandwiching the adjoint operator $A^\dagger$ between any pair of
states yields a number, which is the complex conjugate
of the number resulting from sandwiching $A$ between the
same pair in reversed order. From the
definition it follows that (i) the adjoint of a product satis-
fies $(AB)^\dagger = B^\dagger A^\dagger$, and (ii) the dagger squares to unity:
$(A^\dagger)^\dagger = A$, and is therefore referred to as an involutive
(anti-)automorphism of the algebra of observables. For matrices
this implies that the hermitian adjoint of $A$ is defined as
$A^\dagger = (A^t)^*$, or in words: it is the complex conjugate of
the transpose of $A$.


4. Hermitian or self-adjoint operators. We require that the
eigenvalues of an observable are real numbers, as they
correspond to possible outcomes of measurements, and
that translates into conditions on the particular type of ma-
trices that can represent physical observables. As a matter
of fact the reality condition on the eigenvalues of operators
requires that the quantum observables have to correspond
to hermitian, also known as self-adjoint, operators or ma-
trices. This means that observables satisfy the condition
$A = A^\dagger$. A general hermitian matrix is a matrix $M$ with
complex entries that can be written as $M = S + iA$, where
$S$ is real and symmetric, and $A$ is real and antisymmetric.
For the case of a two-dimensional Hilbert space, like in
the case of a single qubit or a basic quantum spin, all ob-
servables can be expressed as a linear combination of the
unit matrix and the three Pauli or spin matrices of equation
(II.2.2).


5. Norm and boundedness. We like to talk about bounded
operators $A$, meaning that if they work on vectors in Hil-
bert space they do decent things. So what sets the norm
$\|A\|$ for an operator? Here is a reasonable way to do
this: (i) you let $A$ work on all normalized states in $\mathcal{H}$,
(ii) calculate the norms of all the resulting vectors, and
(iii) look at the ‘largest value’ or ‘supremum’ that occurs,
which is denoted by sup. So, the definition of the norm of
the operator $A$ is then:

\[ \|A\|^2 = \sup\{\, \langle\psi|A^\dagger A|\psi\rangle \;:\; |\psi\rangle \in \mathcal{H},\ \langle\psi|\psi\rangle = 1 \,\} . \]


A bounded operator has by definition a finite norm: $\|A\| < \infty$.
If you think of $A$ as a matrix, this statement boils down
to saying that the eigenvalues of the matrix should be fi-
nite.


6. Algebraic structure. The observables form an algebra 
(we want to add and multiply observables). This is easy 
to understand for matrices as we will show in the Math 
Excursions just mentioned. The restrictions (of bounded- 
ness and self-adjointness) are much harder to implement if 
one passes to the infinite-dimensional cases correspond- 
ing to physical systems like particles and fields which have 
continuous variables. To properly address these problems 
one needs some quite sophisticated mathematics involv- 
ing concepts like Banach spaces and C* (‘C-star’) alge- 
bras. This allows for a mathematically rigorous and con- 
sistent formulation of quantum theory. Such axiomatic ap- 
proaches, however, are far beyond the scope of this book, 
though one may of course argue that they are quantessen- 
tial because they address foundational questions. We will 
follow an operational, less rigorous approach, and com- 
fortingly, it turns out that the typical notation we have in- 
troduced doesn’t change much after going rigorous. We 
will treat the expressions using simple rules, glossing over 
the fact that we manipulate symbols which deep down may 
refer to rather sophisticated notions. 
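
For finite matrices the properties of the dagger operation listed above are easy to verify directly. The sketch below is a hedged illustration only: the matrices `A`, `B`, `H` and the states `phi`, `psi` are arbitrary test values chosen here, and the helpers `dagger`, `matmul`, `sandwich` and `close` are hypothetical names, not notation from the book.

```python
def dagger(m):
    """Hermitian adjoint: complex conjugate of the transpose."""
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def matmul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def sandwich(bra_state, m, ket_state):
    """Compute <bra|m|ket> for states given as lists of amplitudes."""
    m_ket = [sum(m[i][j] * ket_state[j] for j in range(len(ket_state)))
             for i in range(len(m))]
    return sum(complex(b).conjugate() * x for b, x in zip(bra_state, m_ket))

def close(a, b, tol=1e-12):
    """Entry-by-entry comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[1 + 2j, 3j], [0.5, -1j]]           # arbitrary, not hermitian
B = [[2, 1 - 1j], [4j, 0.25]]            # arbitrary
H = [[2.0, 1 - 3j], [1 + 3j, -0.5]]      # hermitian: real symmetric plus i times antisymmetric
phi = [0.6, 0.8j]
psi = [1j, 2.0]

product_rule = close(dagger(matmul(A, B)), matmul(dagger(B), dagger(A)))
involution = close(dagger(dagger(A)), A)
definition = abs(sandwich(phi, dagger(A), psi)
                 - sandwich(psi, A, phi).conjugate()) < 1e-12
H_is_hermitian = close(dagger(H), H)
A_is_hermitian = close(dagger(A), A)
H_expectation = sandwich(psi, H, psi)    # real number for a hermitian matrix
print(product_rule, involution, definition, H_is_hermitian, A_is_hermitian)
```

The last line of the example shows why hermiticity matters for observables: sandwiching a hermitian matrix between a state and itself always gives a real number, as a measurable quantity should.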


The qubit observables. The Hilbert space for a qubit 
is two-dimensional, and therefore the observables can be 
represented by 2 x 2 hermitian matrices. A typical set of 
observables would be the set of so-called Pauli matrices 
{X, Y, Z} with: 


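\[ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad
Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad
Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{II.2.2} \]


Any one-qubit observable can be expressed as a linear
combination of the three Pauli matrices and the unit ma-
trix.³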
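
The statement that every one-qubit observable is a combination of the unit matrix and the Pauli matrices can be made concrete. The sketch below assumes the standard trace formula for the coefficients, $c_0 = \mathrm{tr}(M)/2$ and $c_k = \mathrm{tr}(\sigma_k M)/2$, which is not spelled out in the text; the matrix `M` and the function names are illustrative choices.

```python
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
ONE = [[1, 0], [0, 1]]
BASIS = (ONE, X, Y, Z)

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def pauli_coefficients(m):
    """Coefficients (c0, cx, cy, cz) with m = c0*1 + cx*X + cy*Y + cz*Z."""
    return [trace(matmul(p, m)) / 2 for p in BASIS]

def rebuild(coeffs):
    """Reassemble the matrix from its four coefficients."""
    return [[sum(c * p[i][j] for c, p in zip(coeffs, BASIS)) for j in range(2)]
            for i in range(2)]

M = [[2.0, 1 - 3j], [1 + 3j, -0.5]]      # an arbitrary hermitian example
coeffs = pauli_coefficients(M)
print(coeffs)
print(rebuild(coeffs))
```

For a hermitian matrix the four coefficients come out real, so a general qubit observable is fixed by four real numbers.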


In our discussion of classical bit mechanics we already ar-
gued that the $X$ matrix, as operator or gate, acts like a
momentum or displacement operator on the z-space of the
bit, because it acts like the NOT-gate interchanging the two
bit states, $|1\rangle \leftrightarrow |-1\rangle$. It shows nicely how classical phys-
ics (discrete mechanics) and now quantum theory meet
in this picture, with a correspondence between dynami-
cal maps, logical (digital) gates, and quantum observables:
they are all operators acting on a state.


q-gates. Clearly the three Pauli spin matrices above are 
one-qubit gates. In classical computation the X-gate corre- 
sponds to the NOT-gate, and is the only acceptable one-bit 
gate. The others are not, because the Z-gate introduces 
a relative minus sign (which is a phase), and the Y-gate 
introduces complex components, which are both not ad- 
missible for classical bits. This is a first hint that quantum 
bits offer far more possibilities, so let us get back to the 
qubit observables. 


Sample spaces and preferred states 


To each observable $A$ corresponds a set $S_A = \{a_i\}$ of val-
ues it can take. In other words, it is the set of possible
outcomes of a measurement of the observable $A$, which
is also called the spectrum or sample space of $A$. If we
apply the observable $A$ to a state $|\psi\rangle$ and we get a num-
ber $a_i$ multiplying that same state, we say that the sys-
tem is in a state where $A$ takes the value $a_i$. A state with
this property is denoted as $|\psi\rangle = |a_i\rangle$, and is called a
preferred state or eigenstate (or eigenvector) of $A$ with eigen-
value $a_i$. These statements are summarized by the fol-
lowing equation,


\[ A|a_i\rangle = a_i|a_i\rangle . \tag{II.2.3} \]


³The real spin polarization operator has units and equals $S_z =
\frac{1}{2}\hbar Z$, involving an essential factor one half. Throughout the book we
discuss spin one-half directly in terms of the Pauli matrices $\{X, Y, Z\}$,
which in most textbooks are denoted as $(\sigma_x, \sigma_y, \sigma_z)$.


Is the eigenvector defined this way unique? No, it is not:
we can multiply by any overall constant and it is still an
eigenvector. We take care of that by choosing the eigen-
vector to have unit length, but then there is still an overall
phase factor ($e^{i\varphi}$) possible. This factor doesn’t have any
observable consequences.


Qubit eigenstates. Recall that for the classical dynamical
bit we introduced a position $z = \pm 1$ and a momentum
$p = 1$. In the quantum realm these observables should
somehow correspond to certain operators. Let us thereto
consider the $2 \times 2$ matrices $Z$ and $X$ (related to $p$) which
can act on the states in $\mathcal{H}$.


The basis vectors corresponding to the classical states are
indeed eigenvectors of the position operator $Z$:


\[ Z|{+1}\rangle = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = +\begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad
Z|{-1}\rangle = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\begin{pmatrix} 0 \\ 1 \end{pmatrix} , \]


and the eigenvalues $z_\pm = \pm 1$ are the corresponding $z$
values. We conclude that the sample space or spectrum
of the observable $Z$ is $S_z = \{\pm 1\}$.


The operator $X$ also does exactly what you would expect
of the ‘momentum’ operator; it implements the $p = 1$ tran-
sition $|\pm 1\rangle \leftrightarrow |\mp 1\rangle$, as one may verify explicitly:


\[ X|{+1}\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |{-1}\rangle . \]

We also learn that the operator $X^2$ equals the unit matrix.
In fact we have that $X^2 = Y^2 = Z^2 = 1$, which by defini-
tion leaves all states invariant and it therefore implements
the trivial $p = 0$ transition. This is as far as the ‘relation’
between classical and quantum formalism can be traced.


The quantum formalism allows for more because we have
the linear superposition principle as well as the complexi-
fication of the state vectors. We have seen that the states
$|\pm 1\rangle$ correspond to the eigenvectors of the ‘position’ op-
erator $Z$, but in the quantum formalism we can also ask
for the eigenvectors of other observables, for example $X$.
One easily verifies that these correspond to the state vec-
tors $|\pm\rangle = (|{+1}\rangle \pm |{-1}\rangle)/\sqrt{2}$, with again eigenvalues
$x_\pm = \pm 1$ as follows:


\[ X|\pm\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}
= \pm\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm 1 \end{pmatrix} = \pm|\pm\rangle . \]


The eigenvectors $|\pm\rangle$ are real linear superpositions of the
basis states $|\pm 1\rangle$, and we have marked them on the circle
of real states in Figure II.1.7.


Is this all? Are we done? The answer is, no! We have
indeed identified the eigenstates of momentum, which ac-
tually do not have a classical equivalent. This shows the
quantessential possibility that the linear superposition prin-
ciple introduces. However, we have so far only explored
real states and real matrices, and it is here that the quan-
tum formalism summons us to proceed. There are other
independent choices: the one conventionally chosen is the
(complex) matrix $Y$:


\[ Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} . \]


One may verify that $Y = Y^\dagger$, and we see that acting on
the basis states it indeed introduces complex coefficients
as


\[ Y|\pm 1\rangle = \pm i\,|\mp 1\rangle . \]


So loosely speaking we could say that $Y$ introduces a com-
plex part to the standard classical momentum variable $P \sim
X$. We should expect its eigenstates to be complex, as in


\[ \frac{1}{\sqrt{2}}\left(|{+1}\rangle \pm i\,|{-1}\rangle\right) , \]


and the eigenvalues are again $y_\pm = \pm 1$.


The fact that all eigenvalues square to one is not surpris-
ing if one realizes that the matrices themselves square to
the unit matrix: $Z^2 = X^2 = Y^2 = 1$. All quantum observ-
ables in this problem can be written as linear combina-
tions of the independent hermitian matrices $X$, $Y$, $Z$ and
the unit matrix 1. These basic observables have identi-
cal sample spaces $S_x = S_y = S_z = \{+1, -1\}$. Further-
more, as we just showed, they have no eigenvectors in
common. It signals the important fact that these three ob-
servables are incompatible with each other, a notion we
will return to later on. It raises the question of what that
means in terms of measuring these observables in such a
non-eigenstate.
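
The eigenvalue statements of this subsection can be verified mechanically. The sketch below is an illustration under the column-vector conventions used above, with $|{+1}\rangle = (1, 0)$ and $|{-1}\rangle = (0, 1)$; the dictionary `EIGENPAIRS` and the helper names are hypothetical. It also computes the overlaps between eigenvectors of different observables, which would have modulus one if two observables shared an eigenvector.

```python
import math

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
s = 1.0 / math.sqrt(2.0)

# Eigenvectors quoted in the text, paired with their eigenvalues.
EIGENPAIRS = {
    "Z": (Z, [([1, 0], 1), ([0, 1], -1)]),
    "X": (X, [([s, s], 1), ([s, -s], -1)]),
    "Y": (Y, [([s, 1j * s], 1), ([s, -1j * s], -1)]),
}

def apply(m, v):
    """2x2 matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

def is_eigenpair(m, v, eigenvalue, tol=1e-12):
    """True if m v = eigenvalue * v, component by component."""
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(apply(m, v), v))

def overlap(u, v):
    """Inner product <u|v>."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

all_pairs_ok = all(is_eigenpair(m, v, lam)
                   for m, pairs in EIGENPAIRS.values() for v, lam in pairs)

# Moduli of overlaps between eigenvectors of *different* observables.
cross_overlaps = [abs(overlap(u, v))
                  for name_a, (_, pairs_a) in EIGENPAIRS.items()
                  for name_b, (_, pairs_b) in EIGENPAIRS.items() if name_a < name_b
                  for u, _ in pairs_a for v, _ in pairs_b]
print(all_pairs_ok, cross_overlaps)
```

Every cross overlap has modulus $1/\sqrt{2}$, never one, which is the numerical face of the statement that $X$, $Y$ and $Z$ have no eigenvectors in common.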


Expectation values. We may now also define the notion
of the expectation value of an observable $A$ in a quantum
state $|\psi\rangle$ as:

\[ a = \langle A \rangle = \langle\psi|A|\psi\rangle , \tag{II.2.5} \]


which is just a number indeed. The expectation value $a$
is therefore a weighted average of the eigenvalues of $A$,
which depends on which state $|\psi\rangle$ one chooses. This is
consistent with the remark we made earlier that the squares
of the coefficients are probabilities. It means that we ‘sand-
wich’ the operator between a row and column vector, for
example:


\[ \langle{+1}|Z|{+1}\rangle = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 1 , \]


and similarly:

\[ \langle{+1}|X|{+1}\rangle = \langle{+1}|{-1}\rangle = 0 . \]


An expectation value can be calculated for any observable 
in any state and corresponds to some average of measure- 
ment outcomes. 
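
The ‘sandwich’ recipe translates directly into code. The following minimal sketch assumes real example amplitudes $\cos(\theta/2)$ and $\sin(\theta/2)$ with a hypothetical angle `theta`; the function name `expectation` is likewise an illustrative choice. It compares the sandwich with the probability-weighted average of the eigenvalues.

```python
import math

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def expectation(m, state):
    """<state|m|state> for a normalized state given as a list of amplitudes."""
    m_state = [sum(m[i][j] * state[j] for j in range(len(state)))
               for i in range(len(m))]
    return sum(complex(a).conjugate() * b for a, b in zip(state, m_state))

up = [1, 0]                                  # the basis state |+1>
theta = math.pi / 3                          # hypothetical angle for a real superposition
psi = [math.cos(theta / 2), math.sin(theta / 2)]

z_up = expectation(Z, up)                    # sharp value: |+1> is an eigenstate of Z
x_up = expectation(X, up)                    # average of X in a Z eigenstate
z_psi = expectation(Z, psi)
# The same number as a probability-weighted average of the Z eigenvalues +1 and -1.
z_weighted = (+1) * abs(psi[0]) ** 2 + (-1) * abs(psi[1]) ** 2
print(z_up, x_up, z_psi, z_weighted)
```

In the superposition state the expectation value of $Z$ lies between the two eigenvalues, although any single measurement can only return $+1$ or $-1$.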


A Qubit is like a Barbie on a globe 


We return to the qubit state space and point out an
alternative way to parametrize qubit space by directly re-
lating it to the eigenstates of an operator/observable.
This amounts to yet another geometrical representation of
the state space of a qubit or quantum spin, and that will be
useful in a variety of contexts. We start by choosing
a point on the unit two-sphere in $X, Y, Z$ space, as
depicted in the figure (a). The point represents a
unit vector $\hat{n}$, but we want to use it to label a qubit
state, which as we saw is a point on a unit three-
sphere, so we have to do a little more. First we con-
struct a unit sigma matrix $\hat{n} \cdot \sigma$, with $\sigma = \{X, Y, Z\}$,
which to each point on the sphere associates a
particular hermitian ($2 \times 2$) matrix or observable.
This observable is proportional to the spin opera-
tor along that axis. The qubit state that we link to
the point is the eigenvector $|\chi_+\rangle$ of that observable
with the highest eigenvalue ($\lambda_+ = 1$). However the
eigenvector is not unique: it can be multiplied by a phase
factor with some angle $\varphi$ between zero and $2\pi$, so
to completely fix the state we have to specify the
pair $\{\hat{n}, \varphi\}$.
The mathematically alert reader may have experi-
enced a feeling of déjà vu since I am basically re-
peating the story I told in Chapter I.1 about the Hopf
or monopole bundle, where the three-sphere was
interpreted as a phase or circle bundle over the two-
sphere. So the three-sphere is a physically relevant
object: we have seen it appear as the bundle as-
sociated with the fundamental Dirac monopole in
Chapter I.1, as the manifold of the group SU(2) in
the Math Excursion in Volume III on Groups, and
here as the state space of a qubit.
The natural way to represent also the angle $\varphi$
in the picture is to draw the tangent plane to the
sphere at the point chosen, and define $\varphi$ as the
polar angle in that tangent plane as we did in the
Math Excursion on Complex numbers on page 607
of Volume III.


(a): Choosing a state of a qubit is like setting a Barbie
on a globe. Choosing a different frame is like choosing
the North and South Poles along a different axis.


In the figure we have depicted some of the states
we discussed before: on the z-axis we have two
points with the operators $\pm Z$ with eigenvectors
$|\pm 1\rangle$. So the states are now represented as unit
vectors in the tangent plane at the point $\hat{n}$ with
phase angle $\varphi$. So in the plane at the North Pole
we find the states $\pm|1\rangle$ at angles $\varphi = 0, \pi$.


What we have learned is that we can represent
a point on the three-sphere by choosing a point
on the two-sphere and an additional phase in that
point. This way of choosing coordinates on the
three-sphere is indeed completely equivalent to
fixing a Barbie on the Earth's surface by saying
where (s)he stands, and in what direction (s)he
is looking. In a more sophisticated wording one
says one picks a point on the sphere and a frame
in the tangent plane to the sphere at that point as
is illustrated in the figure. So now you no longer
have to say that you cannot imagine how to
choose a point on a three-sphere, even a kid can
do it! Buy him a Barbie of some sort and a globe
and ask him to stick the Barbie on the globe.


Note that the present picture (a) is essentially
different from Figure II.2.2 in that correspond-
ing states are located in different places. For
example the North Pole represents the states
$\exp(i\varphi)|1\rangle$, where $\varphi$ is the angle of the
arrow in the tangent plane.


This set contains in particular the real states $|1\rangle$ for
$\varphi = 0$ and $-|1\rangle$ for $\varphi = \pi$, whereas the states
$\pm|-1\rangle$ are located on the South Pole. In Fig-
ure II.2.2 the states $|\pm 1\rangle$ are perpendicular, in fig-
ure (a) they are antipodal. Changing the qubit
state corresponds to moving around on the three-
sphere and that is nothing but walking over the
globe and looking in various directions. What is all
this good for? This alternative view of the space
of states of a qubit or quantum spin has yielded
some interesting physical insights to be addressed
in Chapter II.3 about probing the state space and
measuring the Berry phase, which is exactly like
having the Barbie in the figure walking around on
the globe.
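
The ‘Barbie on a globe’ labelling can be written out explicitly. The sketch below is a hedged illustration: it assumes spherical angles (`polar`, `azimuth`) for the unit vector and the common convention in which the eigenvector of $\hat{n} \cdot \sigma$ with eigenvalue $+1$ has components $\cos(\mathrm{polar}/2)$ and $e^{i\,\mathrm{azimuth}}\sin(\mathrm{polar}/2)$ before the extra phase is applied. Which vector counts as ‘phase zero’ is a convention chosen here, and the function names are hypothetical.

```python
import cmath
import math

def n_dot_sigma(polar, azimuth):
    """Matrix n.sigma for the unit vector n given by spherical angles."""
    nx = math.sin(polar) * math.cos(azimuth)
    ny = math.sin(polar) * math.sin(azimuth)
    nz = math.cos(polar)
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def qubit_state(polar, azimuth, phase):
    """Eigenvector of n.sigma with eigenvalue +1, times the phase factor e^{i*phase}."""
    overall = cmath.exp(1j * phase)
    return [overall * math.cos(polar / 2),
            overall * cmath.exp(1j * azimuth) * math.sin(polar / 2)]

def apply(m, v):
    """2x2 matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

def is_plus_eigenstate(polar, azimuth, phase, tol=1e-12):
    """Check (n.sigma)|state> = +|state> for the labelled state."""
    state = qubit_state(polar, azimuth, phase)
    image = apply(n_dot_sigma(polar, azimuth), state)
    return all(abs(x - y) < tol for x, y in zip(image, state))

# Where the figure stands (point on the globe) and where it looks (phase).
checks = [is_plus_eigenstate(p, a, ph)
          for p, a, ph in [(0.3, 1.2, 0.0), (1.7, 4.0, 2.5), (math.pi, 0.5, 1.0)]]
north_looking_back = qubit_state(0.0, 0.0, math.pi)   # North Pole with phase pi
print(checks, north_looking_back)
```

At the North Pole with phase $\pi$ the construction returns the state $-|1\rangle$, in line with the description of the tangent plane at the pole given in the box.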


Spin or qubit Hamiltonians 


A crucial observable in physics is the energy or the Ha-
miltonian operator denoted by $H$. The eigenvalues $E_n$ of
the Hamiltonian correspond to the allowed energy levels
of the system. The possible energy eigenstates $|\psi_n\rangle$ are
called stationary states, because they have a trivial time
dependence that resides in the overall phase factor. Linear
combinations of different energy eigenstates would there-
fore have a non-trivial time dependence. Of particular in-
terest is the lowest energy state or ground state of the
system. We consider two examples for the Hamiltonian
of a spin or qubit and show their properties. Our first
choice corresponds to putting the spin in a magnetic field
in the z-direction; the Hamiltonian would then be proportional
to $Z$:

\[ H_1 = bZ , \]


and its eigenstates are $|\pm 1\rangle$, with eigenvalues $\lambda_\pm =
\pm b$. Another sensible choice for the Hamiltonian would be
what is called the total spin operator which is quadratic in
the spins:


\[ H_2 = b(X^2 + Y^2 + Z^2) = b(1 + 1 + 1) = 3b\,1 . \]

This is indeed a bit trivial perhaps, because it is just $3b$ times
the unit matrix. Of course, if we act with this Hamiltonian on
any state, it will return that state with eigenvalue $3b$, i.e.
$H_2|\psi\rangle = 3b|\psi\rangle$. In this case you could say that the Hamil-
tonian is trivial, because all states have the same eigen-
value; they are what we call degenerate. Degeneracies
are a common feature and usually imply that there is some
(hidden) symmetry in the system one considers.
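
Both Hamiltonians can be explored with a short calculation. The sketch below is illustrative only: the field strength `b` and time `t` are hypothetical numbers, and the time evolution assumes the standard phase factor $e^{-iEt/\hbar}$ with $\hbar = 1$, which this section mentions only in words.

```python
import cmath
import math

b = 0.7                                   # hypothetical field strength
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, m):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# H2 = b (X^2 + Y^2 + Z^2), built explicitly from the Pauli matrices.
squares = [matmul(p, p) for p in (X, Y, Z)]
H2 = [[b * sum(sq[i][j] for sq in squares) for j in range(2)] for i in range(2)]

def evolve_under_H1(state, t):
    """Evolve Z-basis amplitudes under H1 = b Z (assumes hbar = 1)."""
    return [state[0] * cmath.exp(-1j * b * t), state[1] * cmath.exp(+1j * b * t)]

def expect_X(state):
    """<state|X|state> for complex amplitudes."""
    return 2.0 * (state[0].conjugate() * state[1]).real

s = 1.0 / math.sqrt(2.0)
t = 1.3                                   # hypothetical time
up_t = evolve_under_H1([1.0, 0.0], t)     # stationary state: only a phase changes
plus_t = evolve_under_H1([s, s], t)       # superposition of two energy eigenstates
x_plus_t = expect_X(plus_t)               # oscillates as cos(2 b t)
print(H2, abs(up_t[0]) ** 2, x_plus_t)
```

The energy eigenstate keeps all its probabilities, while the superposition $|+\rangle$ shows a genuine time dependence in the expectation value of $X$.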


Frames and observables 


The eigenstates $|a_k\rangle$ of a linear operator $A$ are defined by
the equation $A|a_k\rangle = a_k|a_k\rangle$. If $A$ is an $N \times N$ hermitian
(matrix) operator, there are $N$ independent ($N$-dimensional)
eigenvectors and the eigenvalues $a_k$ are real and gener-
ically different. As we will see these eigenvalues are the
possible outcomes of a measurement of that observable.
Generally the eigenstates can be chosen orthonormal, so
that


\[ \langle a_j|a_k\rangle = \delta_{jk} , \quad \text{where } \delta_{jk} = 1 \text{ if } j = k, \text{ and } \delta_{jk} = 0 \text{ if } j \neq k . \tag{II.2.6} \]


This means that the set $\{|a_k\rangle\}$ forms an orthonormal ba-
sis or orthonormal frame for the state space (the Hilbert
space) of the system.


Figure II.2.1: Two frames. Two different frames spanning the
same two-dimensional space of real qubit states. The blue one
is the $Z$ frame $\{|-1\rangle, |1\rangle\}$ and the green one is the $X$ frame
$\{|+\rangle, |-\rangle\}$. The frames are related by a rotation over an angle
$\theta = 45°$.


Qubit frames. Let us briefly illustrate this: the eigenstates
for $A = Z$ are the column vectors


\[ |{+1}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad |{-1}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \]


which have eigenvalues plus and minus one respectively.
The eigenstates $|\pm 1\rangle$ of $Z$ form an orthonormal basis for
the space of qubit states.


If we choose instead $A = X$, then the normalized eigen-
states correspond to


\[ |\pm\rangle = \frac{1}{\sqrt{2}}\left(|1\rangle \pm |-1\rangle\right) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm 1 \end{pmatrix} , \]


and these have eigenvalues $\pm 1$ also. Clearly, the states
$|\pm\rangle$ form an alternative basis for the qubit states. In Fig-
ure II.2.2 we have depicted the two frames where the unit
circle describes all the states with real coefficients $\alpha$ and
$\beta$. This picture will return in many guises when we discuss
measurements in quantum mechanics. A priori there is
no preference for any particular basis; the best choice de-
pends on the questions you want to answer. Clearly if we
are going to measure some physical quantity, the eigen-
states of the corresponding operator will play an important
role.


Figure II.2.2: Frames and eigenvalues. The frames corre-
sponding to eigenstates of $Z$ (blue) and $X$ (green) respectively.
The axes are labeled by the corresponding eigenvalues. The
circle represents the normalized qubit states with real $\alpha$ and $\beta$.


What the examples just given also show is that the Z and 
X operators have no eigenvectors in common. That is nec- 
essarily the case because the operators do not commute, 
and they are called incompatible observables. We return 
to this notion in a forthcoming section. 
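
That $Z$ and $X$ do not commute is a one-line matrix calculation. The sketch below simply evaluates the commutator $ZX - XZ$ and compares it with $2iY$, a standard identity for the Pauli matrices that is not stated in the text; the helper names are illustrative.

```python
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """The commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

zx = commutator(Z, X)                                    # nonzero: Z and X are incompatible
zz = commutator(Z, Z)                                    # zero: an observable commutes with itself
two_i_y = [[2j * Y[i][j] for j in range(2)] for i in range(2)]
print(zx, zz)
```

A vanishing commutator would have allowed a common frame of eigenvectors; here the result is nonzero, so no frame diagonalizes $Z$ and $X$ at the same time.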


Frame choices. When writing down an explicit expres-
sion for a qubit, or in fact for any quantum system, we first
have to choose a basis $\{|i\rangle\}$ in which the state can be ex-
panded. This basis is a matter of choice. In Figure II.2.1
we have for example depicted the standard blue frame, but
also a different green frame consisting of the states $|+\rangle$
and $|-\rangle$. In Figure II.2.3 we have depicted two frames for
a three-dimensional vector space. What is quite evident
from the figures is that different frames can be transformed
into each other by a simple rotation. That is so because ro-
tations by definition keep not only the length of vectors but
also the angles between them the same.⁴


Figure II.2.3: Frame rotations. Two frames spanning the space
$\mathbb{R}^3$. The ‘rabbit’ and the ‘pig’ frames can be rotated into each
other. For example, first rotate the $z'$-axis to the $z$-axis; then the
$x$-, $x'$-, $y$- and $y'$-axes all lie in the $x$-$y$ plane, where they
can be rotated into each other by a rotation around the $z$-
axis. This also holds for frames in higher dimensions because
rotations preserve the origin, the length of vectors and also the
angles between them. This in fact defines what a rotation is.


⁴A rotation in fact preserves the orientation of a frame. If we in-
terchange the x- and y-axes in Figure II.2.3, then we also have an or-
thonormal frame, but it cannot be obtained by rotating the old frame,
exactly because its orientation is opposite. The frames in the figure
are right-handed, meaning that if we rotate from x to y, the right-handed
rotation, by the ‘like’-rule, points in the positive z-direction.
Unitary transformations


A rotation of a vector or a frame is an operation or a trans-
formation on such a vector or frame. You may in this re-
spect think of a frame as a solid cube: under rotations its
shape is conserved, it stays congruent. In an $N$-dimension-
al space such rotations can be represented by an $N \times N$
matrix that acts on a vector.


An important property of rotations is that they satisfy the
group property, namely that the result of two successive
rotations is again a rotation. This is obvious in the two-
dimensional case because you just add the angles. In
three dimensions a simple way to see it is to look at the
‘rabbit’ unit vector in the z-direction in Figure II.2.3. If we
would trace out the arrowhead under all possible rota-
tions, we should get the unit sphere. Any rotation of a
vector around an orthogonal axis would move the arrow-
head along a great circle over the sphere, ‘great’ because it
is a circle of maximal size on a given sphere. It is also true
that the shortest distance between two points on the sphere
is exactly the segment of the great circle on
which both points lie. So, if we make first a rotation of the
vector around some axis $\hat{n}_1$, the vector moves from the
first point A over a segment of some great circle to a second
point B. Next we move the resulting vector over a given
angle around a second axis $\hat{n}_2$; then the vector ends up
at a third point C on the sphere. The combined rotation is
then just the rotation that moves the vector from A directly
to C over the great circle connecting them.


This is all simple to imagine, and therefore let us now trans-
late these simple geometric intuitions into a symbolic lan-
guage. We start with rotating ket vectors with rotation ma-
trices $U_i$:


\[ |\psi_2\rangle = U_1|\psi_1\rangle , \qquad
|\psi_3\rangle = U_2|\psi_2\rangle = U_2U_1|\psi_1\rangle = U_3|\psi_1\rangle
\;\Leftrightarrow\; U_3 = U_2U_1 . \tag{II.2.7} \]


This is true for arbitrary vectors and also for arbitrary rota-
tions. Under a frame rotation $U$, the conjugate bra vector
will rotate like:


\[ \langle\psi_2| = \langle\psi_1|\,U^\dagger , \]


with the conjugated rotation matrix $U^\dagger$, which can be ob-
tained from $U$ by interchanging rows and columns (which
is called taking its transpose $U^{tr}$) and also taking its com-
plex conjugate (meaning conjugating all its matrix elements,
i.e. its entries), so $U^\dagger = (U^{tr})^*$. We require the length and
inner product of vectors to be preserved under rotations,
so if we simultaneously rotate arbitrary vectors $|\psi\rangle$ and $|\phi\rangle$
by $U$, then we have to impose:


\[ \langle\phi_2|\psi_2\rangle = \langle\phi_1|U^\dagger U|\psi_1\rangle = \langle\phi_1|\psi_1\rangle . \]


From the last equality we conclude that rotations appar- 
ently correspond to a unitary transformation, satisfying the 
unitarity condition:⁵

$$U^\dagger U = 1 .$$
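
For readers who like to check such statements numerically, here is a minimal sketch in plain Python. The particular matrix U and the two test vectors are my own illustrative choices, not taken from the text; any unitary matrix should behave the same way.

```python
import cmath
import math

def dagger(m):
    """Conjugate transpose: swap rows and columns, then conjugate every entry."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

def inner(phi, psi):
    """Inner product <phi|psi>: the bra components are complex conjugated."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

# Assumed example of a unitary matrix: a rotation by angle t with an extra phase.
t, phase = 0.7, cmath.exp(0.3j)
U = [[math.cos(t), -math.sin(t)],
     [phase * math.sin(t), phase * math.cos(t)]]

psi = [1 / math.sqrt(2), 1j / math.sqrt(2)]
phi = [0.6, 0.8]

UdagU = matmul(dagger(U), U)                     # should be the unit matrix
before = inner(phi, psi)                         # <phi|psi> before rotating
after = inner(apply(U, phi), apply(U, psi))      # <phi|psi> after rotating both
print(UdagU)
print(before, after)
```

Up to rounding errors the first line printed is the unit matrix, and the two inner products coincide.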


The rotations in N complex dimensions form a mathematical
structure called a group, basically because they satisfy
the group property, equation (II.2.7). This group is called
the unitary group, denoted by U(N). More precisely it is the
special unitary group SU(N) because the rotations preserve
the orientation of the frame (this is the cyclic order
$\hat{x}, \hat{y}, \hat{z}$, where by definition $\hat{x} \times \hat{y} = \hat{z}$). We refer to the
Math Excursion A for further details.


Photon gates and wave plates 


One can think of these unitary operations as a transfor- 
mation on the qubit state vector. And changing the state 


⁵Note that if we rotate in real space the matrices become real and
there is no complex conjugation; real rotations are therefore orthogonal
matrices O satisfying the condition $O^{\mathrm{tr}}O = 1$. These matrices also
form a closed group under multiplication, denoted as the orthogonal
group O(N). Indeed, where quantum physicists are married to unitary
groups, classical physicists are married to the orthogonal ones. It is the
difference between being complex and being real.
[Figure II.2.4 panel labels: |v⟩ → |v⟩, |h⟩ → −|h⟩, |+⟩ → |−⟩, |L⟩ → |R⟩]


Figure II.2.4: Wave plates. The wave plates with optical thickness
of $\lambda/2$ and $\lambda/4$ can be used to change the polarization
state of a photon. They are unitary one-qubit phase gates, and
are the physical realizations of the transformations U described
in the text, on the states defined in Figure II.1.11. The transformations
can be inverted, meaning that we reverse the direction
in the picture, so, if going to the right corresponds to some U,
then going to the left corresponds to $U^\dagger$.


vector really amounts to processing quantum information 
as the in-state gets transformed into some out-state. Such 
manipulations can be performed on real photons relatively 
simply by what are called wave plates. These have two pa- 
rameters: a principal axis and a given optical thickness as 
is depicted in Figure II.2.4. We have shown the effect on 
the polarization state of a photon when it passes through a 
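wave plate with its principal axis along the z-axis in the figure.
The plate acts like what is called a phase gate $P(\theta)$; it
leaves the polarization along the principal axis unchanged,
and multiplies the orthogonal component by a phase corresponding
to the optical thickness of the plate. So in the
case at hand the action is given by,


$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix} \;\to\; \begin{pmatrix}1 & 0\\ 0 & e^{i\theta}\end{pmatrix}\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = \begin{pmatrix}\alpha\\ e^{i\theta}\beta\end{pmatrix} .$$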


Indeed the $\lambda/2$ plate rotates the lower component over an
angle $\theta = \pi$ in the complex plane, leading to the phase factor −1,
while the $\lambda/4$ plate rotates it by an angle $\theta = \pi/2$, giving an
imaginary factor i.
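
The two wave plates can be mimicked in a few lines of Python. This is only a sketch under the assumption that the phase gate is the diagonal matrix diag(1, e^{iθ}) in the basis (principal axis, orthogonal axis); the function names and the test state are mine.

```python
import cmath
import math

def phase_gate(theta):
    """P(theta) = diag(1, exp(i*theta)): the first component is left alone."""
    return [[1, 0], [0, cmath.exp(1j * theta)]]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

# An equal superposition as an (arbitrary) input polarization state.
alpha = beta = 1 / math.sqrt(2)

half_wave = apply(phase_gate(math.pi), [alpha, beta])         # lambda/2 plate
quarter_wave = apply(phase_gate(math.pi / 2), [alpha, beta])  # lambda/4 plate

print(half_wave)      # lower component is multiplied by -1 (up to rounding)
print(quarter_wave)   # lower component is multiplied by i (up to rounding)
```

In both cases the upper component is untouched and the length of the state vector stays equal to one, as it should for a unitary gate.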


Incompatible observables 


The fact that observables are represented by operators re- 
flects the quantessential property that measurements may 
alter the state, and therefore that the outcomes of different 
measurements may depend on the order in which the mea- 
surements are performed. This latter property expresses 
the fact that the operators that represent observables in 
quantum mechanics do not necessarily commute, by which 
we mean that for the product of two observables A and B 
one may have that $AB \neq BA$, and we say that such observables
are incompatible. It is pretty weird to be told
that momentum times position would not be equal to 
position times momentum, but that is the way it really 
is if you think of them as operators instead of numbers. 
This is common in the quantum world because matrices 
generically do not commute. For the simple set of qubit
observables given in equation (II.2.2), you can verify that
they indeed do not commute with one another: for example
$ZX - XZ = 2iY$.
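
The verification takes only a few lines. The sketch below assumes that the observables of equation (II.2.2) are the Pauli matrices in their standard form; the helper names are mine.

```python
# Standard Pauli matrices (assumed to match equation (II.2.2)).
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

two_i_Y = [[2j * Y[i][j] for j in range(2)] for i in range(2)]

print(commutator(Z, X))   # ZX - XZ
print(two_i_Y)            # 2iY, the same matrix
```

By contrast, `commutator(Z, Z)` gives the zero matrix: every observable is compatible with itself.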


To illustrate this non-commutativity we have in Figure II.2.5 
depicted a sequence of two 90° rotations in opposite or- 
der: on the left we rotate the book first around the z-axis 
and then around the x-axis, and on the right we do it in 
the opposite order. At the bottom one sees that the re- 
sulting orientations of the book clearly differ, meaning that 
for the operations on the state of the book b one has that 
$R_zR_x \neq R_xR_z$. For the case of a particle it turns out that the
position and momentum observables X and P do not commute:
one finds that $XP - PX = i\hbar$. This non-commutativity
of observables has dramatic consequences and lies at the 
root of many of the at first sight inconvenient truths that 
quantum theory revealed about the basic workings of na- 
ture. 
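
If you have no book at hand, the experiment can be sketched with ordinary 3 × 3 rotation matrices. The matrices below are the standard counterclockwise quarter turns about the z- and x-axes (my choice of convention; for clockwise turns the conclusion is the same), and `spine` is a hypothetical vector marking the direction of the book's spine.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

# Quarter turns (90 degrees) about the z-axis and the x-axis.
R_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
R_x = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]

spine = [0, 1, 0]   # hypothetical marker: the spine points along the y-axis

# The operator applied first stands on the right.
z_then_x = apply(matmul(R_x, R_z), spine)
x_then_z = apply(matmul(R_z, R_x), spine)

print(z_then_x)   # [-1, 0, 0]
print(x_then_z)   # [0, 0, 1]
```

The spine ends up pointing in two different directions, which is the matrix version of the two books at the bottom of the figure.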
Figure II.2.5: Non-commuting rotations. We illustrate the non-commutativity
of the 90° clockwise rotations $R_z$ and $R_x$ around
the z- and x-axes respectively. The order in which they are applied
(to the book) does matter and clearly leads to a different
final state.


The labelling of quantum states. Consider an N × N matrix
observable. In the generic case it will have N different
real eigenvalues, with orthogonal eigenvectors. However,
it may happen that two or more eigenvalues coincide, in
which case there will be more than a single (independent)
eigenvector corresponding to a given eigenvalue. We then say
that the spectrum of the observable A is degenerate. In
that case the eigenvalue $a_i$ labels not just a particular state
but rather some subspace $V_i$ of the Hilbert space. In fact
states can be simultaneous eigenstates of other observables.
The previously mentioned state may also be an
eigenvector with value $b_j$ for the observable B, and we
may label that state by the element of the combined sample
space and write $|\Psi\rangle = |a_i, b_j, \ldots\rangle$.


In general there will be many different sets consisting of 
a maximal number of independent, but compatible observ- 
ables and these can be used to label a particular set of ba- 
sis states (a frame) of the system. Observables A and B 


for which a joint set of eigenstates can be chosen, neces- 
sarily commute and are therefore by definition compatible. 
What makes quantum theory so special is that this is of- 
ten not the case, so that we continuously have to deal with 
observables A and B that are incompatible. For such in- 
compatible observables Heisenberg’s uncertainty relations 
impose quantessential restrictions, to which we will turn
shortly.


Quantum setting. We conclude that there are four as- 
pects in which the quantum setting significantly differs from 
the classical one: 

(i) the set of admissible values for a dynamical variable 
may differ, in particular it may be a discrete set in which 
case the values would be quantized whereas in the classi- 
cal case the values would be continuous; 

(ii) a quantum variable may not have a classical analogue 
at all, such as a particle having an intrinsic rotational de- 
gree of freedom called ‘spin’, and most importantly; 

(iii) in a given state of a quantum system generally incom- 
patible observables cannot be simultaneously assigned a 
definite value. The non-zero spread in observed values in 
a given state is then governed by Heisenberg’s uncertainty 
principle to be discussed later; 

(iv) certain classical dynamical variables which involve prod- 
ucts of incompatible variables will not have an unambigu- 
ous or unique quantum analog. There may be ordering 
ambiguities. 


At first it seems inconceivable that such a vile theory has 
become one of the crown jewels of a rigorous science 
like Physics! It is remarkable that a theory can host this 
very anti-intuitive notion of incompatibility without becom- 
ing inconsistent. This notion of incompatibility has pro- 
found repercussions on what this theory can possibly mean 
and these matters will of course be discussed extensively 
in the forthcoming chapters. 
Projection operators


Closely related to the notion of the state vector and a basis
$\{|i\rangle\}$ is the concept of a projector. A projector is an operator
P that may act on vectors in a vector space like H and
it projects the vectors along a particular axis, or in general
on some subspace of H. By virtue of this defining property,
applying a projector P twice on any vector gives the
same result as applying it once: $P^2 = P$. Note that 1 − P
is also a projection operator as it also squares to itself. We
can rewrite $P^2 = P$ as $P(1 - P) = 0$, which amounts to saying
that P and 1 − P project on orthogonal subspaces of
H. So given a projection operator one can make an orthogonal
decomposition of the Hilbert space. On vectors
in the first subspace the projector acts as the unit operator,
and on the vectors in the orthogonal complement it acts
like the zero operator. This observation is highly relevant if
one wants to assign properties to a quantum state. A pro- 
jector P assigns a truth value to a state, but only if the state 
vector sits entirely in the subspace on which P projects, or 
its orthogonal complement. Clearly if the state vector has 
components in both, you cannot say it has the property nor 
can you say that it has not. But in that case there are other 
projection operators that do a better job, because there 
are always subspaces which contain that state vector or to 
which that vector is orthogonal. The notion of projectors 
plays an important role in the theory of quantum measure- 
ment as we will see in the next section. 
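
A small numerical sketch may make the 'truth value' idea concrete. As an assumed example I take the projector on the diagonal direction (1, 1)/√2 of a two-dimensional space; the variable names are mine.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

P = [[0.5, 0.5], [0.5, 0.5]]                              # projects on the direction (1, 1)
Q = [[1 - P[0][0], -P[0][1]], [-P[1][0], 1 - P[1][1]]]    # the complement 1 - P

P_squared = matmul(P, P)       # equals P
P_times_Q = matmul(P, Q)       # equals the zero matrix

inside = apply(P, [1, 1])      # vector in the subspace: returned unchanged, 'true'
outside = apply(P, [1, -1])    # vector in the complement: sent to zero, 'false'
mixed = apply(P, [1, 0])       # components in both: neither true nor false

print(P_squared, P_times_Q)
print(inside, outside, mixed)
```

Only for the first two vectors does the projector give a clear yes or no; for the third it returns a shortened vector, which is the situation described above where the property cannot be assigned.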


Elementary projectors. One easily verifies that the projector
$P_j$ which projects on the axis corresponding to the
basis vector $|j\rangle$ is given by:


$$P_j = |j\rangle\langle j| , \tag{II.2.8}$$


and indeed its square equals itself, and applying it to a state
vector and using (II.2.6) yields:


$$P_j|\psi\rangle = \sum_i \alpha_i\,|j\rangle\langle j|i\rangle = \alpha_j\,|j\rangle ,$$


which is exactly the component along the j-axis, i.e. $\langle j|\psi\rangle\,|j\rangle$.


Note that any sum over a subset of the $P_j$ is also a projection
operator (because they are mutually orthogonal, $P_iP_j = \delta_{ij}P_j$),
and so is $|\Psi\rangle\langle\Psi|$ for any normalized state $|\Psi\rangle$.


Consider ‘bracketing’ an elementary projector in some state:


$$p_i = \langle\psi|P_i|\psi\rangle = |\langle i|\psi\rangle|^2 = |\alpha_i|^2 , \tag{II.2.9}$$


it yields the squared magnitude of the component along the basis vector.
This is the probability $p_i$ of finding the particle in the state
$|i\rangle$ in an appropriate measurement. The normalization condition
(II.1.9) is nothing but the statement that the total probability
of finding the system in some state equals one, as it
should.
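
As a quick illustration of equation (II.2.9), the following sketch computes the probabilities for an assumed example state with components 0.6 and 0.8i; the function names are hypothetical helpers, not notation from the text.

```python
def projector(j, dim):
    """Elementary projector |j><j| in a dim-dimensional orthonormal basis."""
    return [[1 if (r == j and c == j) else 0 for c in range(dim)] for r in range(dim)]

def expectation(psi, op):
    """<psi| op |psi> for a state given by its list of components."""
    n = len(psi)
    return sum(psi[r].conjugate() * op[r][c] * psi[c] for r in range(n) for c in range(n))

psi = [0.6, 0.8j]   # normalized: 0.36 + 0.64 = 1

probs = [expectation(psi, projector(j, 2)).real for j in range(2)]
print(probs)        # approximately [0.36, 0.64]
print(sum(probs))   # approximately 1
```

Note that the complex phase of the second component drops out: only the magnitudes of the components enter the probabilities.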


Completeness. One now can also understand that the set
of elementary projection operators satisfies the so-called
completeness relation, which amounts to the statement
that


$$\sum_j P_j = \sum_j |j\rangle\langle j| = 1 . \tag{II.2.10}$$


This means that it works as the identity operator: acting
on any state vector $|\psi\rangle$ it gives back the same state. The
completeness relation is also referred to as the projective
decomposition of the identity operator, since it is the operator
equivalent of the statement that any state vector can
be decomposed in its components with respect to some
frame.


Observables and projectors. From the orthonormality
relations of eigenvectors $\{|a_j\rangle\}$ of an observable A, and
the properties of the corresponding elementary projectors
$P_j$, one may show that we can actually write the operator
A as:


$$A = \sum_j a_j P_j .$$


Needless to say, all projection operators are observables
(as $P = P^\dagger$), but not the other way around!
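
Here is a minimal check of this decomposition for one concrete case. I assume the observable is the Pauli matrix X in its standard form, with eigenvalues ±1 and eigenvectors (1, ±1)/√2; the same sketch also confirms the completeness relation for this frame.

```python
import math

def outer(ket):
    """The projector |v><v| for a normalized ket given as a list of components."""
    return [[a * b.conjugate() for b in ket] for a in ket]

s = 1 / math.sqrt(2)
plus, minus = [s, s], [s, -s]          # eigenvectors of X with eigenvalues +1 and -1
P_plus, P_minus = outer(plus), outer(minus)

# Completeness: the two projectors add up to the unit matrix.
identity = [[P_plus[i][j] + P_minus[i][j] for j in range(2)] for i in range(2)]

# Spectral decomposition: X = (+1) P_plus + (-1) P_minus.
X_rebuilt = [[P_plus[i][j] - P_minus[i][j] for j in range(2)] for i in range(2)]

print(identity)    # approximately [[1, 0], [0, 1]]
print(X_rebuilt)   # approximately [[0, 1], [1, 0]]
```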


Projectors on subspaces of H. It is not hard to see that 
along these lines we can construct projectors that project 
Figure II.2.6: A photon polarizer. A polarizer projects the photon
onto a particular polarization state. There is a calculable
probability for the photon to come through, after which it is fully
polarized in the selected direction.


on a subspace of the Hilbert space, by adding up some
subset F of elementary projectors:


$$P_F = \sum_{j \in F} P_j .$$


Such operators play an important role in the assignment of 
quantum properties to states in Hilbert space. 


Photon polarizers are projectors. Photons can be pro- 
jected on certain subspaces of the full Hilbert space, and 
these operations are quite familiar and dear to all of us. 
We can use a color filter to project on a certain subspace 
of wavelengths or frequencies. For example, you want to 
filter out the UV component of the light if you are high up 
in the mountains. But in the present context of qubits we 
should rather think of a polarizer which projects the polarization
vector on a particular axis. As we have indicated
in Figure II.2.6, the polarizer $P_{+1}$ actually does more than
just project the state: it projects the in-state $|+\rangle$ on the
chosen $|+1\rangle$ direction of the polarizer, but then renormalizes
the state to a vector of length one, so the out-state
is $|+1\rangle$. The squared magnitude of the incoming component along
that direction tells you the probability that the photon will be transmitted, so
$p_{\mathrm{out}} = (1/\sqrt{2})^2 = 1/2$. And that is what your fancy polaroid
shades are really about. It is indeed a projector
in the sense that if we take the photons that come through
some polarizer and subsequently let them go through an
identical polarizer, then all the photons will get through. If
one rotates the second polarizer by 90 degrees, then that 
projects on the orthogonal subspace, and a photon that 
gets through the first polarizer will be blocked by the sec- 
ond. To check this you need two Ray-Bans, or if you are 
blessed with the curiosity of a true scientist you would hap- 
pily break the one and only one you have in two pieces of 
course. 
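
If you would rather keep your sunglasses in one piece, the experiment can be sketched in Python. The function `transmit` below is a hypothetical helper of mine: it returns the transmission probability for a polarizer along a given unit axis, together with the renormalized out-state (a possible overall phase of the out-state is ignored, since it has no physical meaning).

```python
import math

def transmit(axis, state):
    """Polarizer along the unit vector `axis`: return (probability, out-state)."""
    amplitude = sum(a.conjugate() * b for a, b in zip(axis, state))   # <axis|state>
    probability = abs(amplitude) ** 2
    out_state = list(axis) if probability > 0 else None   # blocked photon: no out-state
    return probability, out_state

s = 1 / math.sqrt(2)
plus = [s, s]                 # in-state |+>
up, down = [1, 0], [0, 1]     # polarizer axes |+1> and |-1>

p_first, after_first = transmit(up, plus)     # first polarizer
p_same, _ = transmit(up, after_first)         # identical second polarizer
p_crossed, _ = transmit(down, after_first)    # second polarizer rotated by 90 degrees

print(p_first, p_same, p_crossed)   # approximately 0.5, then 1 and 0
```

Half of the photons pass the first polarizer; of those, all pass an identical second one and none pass a crossed one.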


Note that for a large number of photons the result repro- 
duces the classical result, if one identifies the reduction in 
the light intensity due to the polarizer with the ratio of the 
number of outgoing and the number of incoming photons. 
In the classical Maxwell theory, the light intensity is given 
by the square of the electric field. The classical field E is
literally projected, giving the factor $1/\sqrt{2}$ in the magnitude
of the projected component. And its square does give the
reduction factor 1/2, the same as in the quantum case.
But again, for a single photon there is no classical descrip- 
tion, and to explain the single photon experimental results 
one has to go quantum. 


Raising and lowering operators 


Let me try to make you more familiar with thinking about 
dynamical variables as operators or matrices by demon- 
strating a different use of the algebra of observables as 
operators on states. You may think of a system having 
some basic operator Q with its associated eigenvalues 
and eigenstates. We also require that the system has 
some ground state that we for the moment assume to be 
a unique lowest state $|0\rangle$ with $Q|0\rangle = q_0|0\rangle$. Then we may
search for operators $A^\pm$ that satisfy the relation:


$$[Q, A^\pm] = \pm q\,A^\pm . \tag{II.2.11}$$


Writing this expression out we obtain the following property
of the state $A^\pm|\psi_n\rangle$,


$$Q\,(A^\pm|\psi_n\rangle) = (q_n \pm q)\,(A^\pm|\psi_n\rangle) .$$


This means that starting with an eigenstate of Q, the operators
$A^\pm$ create again an eigenstate of Q with a higher
(lower) eigenvalue. Such raising and lowering operators
are extremely useful because they would in principle allow
you to create the excited states from the ground state; they
allow you to move through the spectrum of Q eigenstates
and are therefore also called laddering or step operators.
Clearly the raising operators can be written in an explicit
form as:


$$A^+ = \sum_n |n+1\rangle\langle n| . \tag{II.2.12}$$


Such a setup works only if the eigenvalues $q_n$ are evenly
spaced, in other words if $q_n = q_0 + nq$, but this is quite
often the case.


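Let us see how this works out for the example of the Hamiltonian
$H_1 = Z$ of the previous subsection. The step
operators are now the following linear combinations:


$$Z_+ = \tfrac{1}{2}(X + iY) = |+1\rangle\langle -1| \quad\Leftrightarrow\quad Z_+ = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix} , \tag{II.2.13}$$


and


$$Z_- = (Z_+)^\dagger = |-1\rangle\langle +1| \quad\Leftrightarrow\quad Z_- = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix} . \tag{II.2.14}$$


They are not hermitian but, as advertised, they do satisfy
the commutation relations (II.2.11) with q = 2, and
they furthermore satisfy:


$$[Z_+, Z_-] = Z ,$$


which is just the Hamiltonian.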


Now check that they step us through the spectrum of states.
The ground state is in this case the state $|-1\rangle$ with lowest
eigenvalue −1. Acting with the raising operator $Z_+$
yields:


$$Z_+|-1\rangle = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\begin{pmatrix}0\\ 1\end{pmatrix} = \begin{pmatrix}1\\ 0\end{pmatrix} = |+1\rangle ,$$


with eigenvalue +1. You may want to check that the raising
operator applied to the highest eigenstate $|+1\rangle$ yields zero
and a similar statement holds about applying the lowering
operator and the lowest energy or ground state.


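We may turn the argument around and say that a lowering
operator can be used to find the ground state $|\psi_0\rangle$ (up to
some constant phase factor), by requiring $A^-|\psi_0\rangle = 0$; in
the present case:


$$Z_-|\psi_0\rangle = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix}\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = \begin{pmatrix}0\\ \alpha\end{pmatrix} = 0 \quad\Rightarrow\quad \alpha = 0 ,$$


from which follows that $|\psi_0\rangle = |-1\rangle$, up to the phase factor $\beta$.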


The action of the step operators on the states is summarized
in a simple spectral diagram in Figure II.2.7. Note that
the figure is also supposed to imply the fact that


$$Z_+|+1\rangle = 0 ,$$


where the ‘0’ on the right-hand side is the zero vector in
the vector space. This zero does not represent a physical
state as it has norm zero. The spectrum is bounded: it has
a so-called highest and lowest weight state.
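
All the statements about the qubit step operators can be checked with integer arithmetic. The sketch below assumes the standard matrix forms, with |+1⟩ = (1, 0) and |−1⟩ = (0, 1), Z = diag(1, −1), and step operators that each have a single non-zero entry; the names `Zp` and `Zm` are mine.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, m):
    return [[c * m[i][j] for j in range(2)] for i in range(2)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

Z = [[1, 0], [0, -1]]
Zp = [[0, 1], [0, 0]]    # raising operator  |+1><-1|
Zm = [[0, 0], [1, 0]]    # lowering operator |-1><+1|
up, down = [1, 0], [0, 1]

print(commutator(Z, Zp) == scale(2, Zp))    # True: [Z, Z+] = +2 Z+
print(commutator(Z, Zm) == scale(-2, Zm))   # True: [Z, Z-] = -2 Z-
print(commutator(Zp, Zm) == Z)              # True: [Z+, Z-] = Z
print(apply(Zp, down))                      # [1, 0]: the ground state is raised
print(apply(Zp, up))                        # [0, 0]: the zero vector, not a state
```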


State operators. These operators and the pictures that
represent their actions are quite useful in situations that
are more complicated than qubits. What they allow you
to do is to give a different symbolic representation of the
general qubit state (II.1.2), as we can write:


$$|\psi\rangle = (\alpha Z_+ + \beta)\,|-1\rangle = \hat{\psi}_+|-1\rangle , \tag{II.2.15}$$
Figure II.2.7: Step operators. The action of the step operators
$Z_\pm$ on the basis states $|\pm 1\rangle$. It is also implied that $Z_\pm|\pm 1\rangle = 0$,
where 0 is the zero vector, which is not a physical state.


or alternatively:


$$|\psi\rangle = (\alpha + \beta Z_-)\,|+1\rangle = \hat{\psi}_-|+1\rangle . \tag{II.2.16}$$


What this equation shows is that there is a correspondence
between states and operators, if we know either a ‘lowest
weight’ or a ‘highest weight’ reference state $|0\rangle_\pm$, defined
by the conditions,


$$Z_-|0\rangle_- = 0 \quad\text{or}\quad Z_+|0\rangle_+ = 0 ,$$


which, as we saw, yielded that the lowest or ground state is
$|0\rangle_- = |-1\rangle$. What we learn is that there is an equivalence
between specifying a state vector $|\psi\rangle$, and an operator $\hat{\psi}$
that acts on a given ground state $|0\rangle$. It is this perspective
which turns out to be essential for understanding the
spectrum of quantum particles and fields.


Figure II.2.8: Truth is in the eye of the beholder. Time’s eye 
(1949) by Salvador Dalí. (© Salvador Dalí, Fundación Gala-Salvador
Dalí)


Quantum measurement 


Physics as a science is deeply empirical. Theories have 
to be thoroughly tested by experiments and have to be 
adapted or refuted if they fail to be confirmed. Experiments 
involve measurements in which the features of the proposed
theory are observed by some means. For quantum theory this
means that also its subtle, if not exotic,
features, like the linear superposition principle and
the possibility of entangled states, have to be put to the test. The basic theoretical
features were hard to put to test at the time when the the- 
ory was formulated, because the experimental techniques 
were not sophisticated enough to reach the necessary de- 
gree of precision. The story of quantum measurement 
therefore has a rich history. The first dramatic pseudo 
experimental developments consisted of the well-known 
‘gedanken’ or ‘thought’ experiments devised by none other
than Einstein and Schrödinger themselves. Schrödinger’s
cat addressed the problematic side of the outrageous idea
that a cat could be in a state that is a linear combination of
a ‘dead’ and an ‘alive’ state, and we discussed it in the previous
chapter on page 266. The other is the EPR paradox,
addressing the problematic aspect of nonlocality as a direct
consequence of having spatially separated particles in
an entangled state. This led to the view that quantum theory
would be an incomplete theory to which ‘hidden variables’
would have to be added to make it local and causally
consistent. It sparked fierce debates like the Einstein-Bohr
debate, and it prompted a search for alternative interpretations
or even theories, like the ‘hidden variable’ theory of
David Bohm and the ‘many worlds’ interpretation proposed
by Hugh Everett in 1957.


Our strategy in this book is that using our knowledge of 
states and observables as discussed so far, we present 
the commonly adopted (called orthodox by some) Copen- 
hagen interpretation of measurement in this chapter, pri- 
marily because it has never been falsified, quite the op- 
posite. Indeed it has been vindicated by numerous ex- 
tremely refined recent experiments. Yet, not everybody is 
quite comfortable with the situation and we will get to some 
of the paradoxes and their (experimental) resolutions in
more detail in Chapter II.4.


The question of quantum measurement has two parts to
it. Part one answers the question: given that the system
is in a state $|\psi\rangle$, what can we say about the measurement
outcome of some observable A? And the second part answers
the question: how does a measurement affect the
state $|\psi\rangle$? We will see that in quantum theory object and
subject are, strictly speaking, no longer separable.


Probabilism. The interpretation of the wavefunction is at
first sight quite bizarre: it is a measure for where the particle
may be found if one is to make a measurement. More
precisely, its absolute value squared gives the probability density of finding
the particle at position x at time t. Expressed in a compact
formula it reads simply: $P(x,t) = |\Psi(x,t)|^2$. Probability?
What? Didn’t we completely specify the state, and now all at
once we start talking about the odds of finding the particle
somewhere? Is that all we can do? Can't we do better?
Good question, so, let me quote what Richard Feynman 
said on this remarkable quantum state of affairs in part 
three of his famous Lectures on Physics. 


We would like to emphasize an important differ- 
ence between classical and quantum mechanics. 
We have been talking about the probability that the 
electron will arrive in a given circumstance. We 
have implied that in an experimental arrangement 
(even in the best possible one) it would be impos- 
sible to predict exactly what would happen. We 
can only predict the odds! This would mean, if it 
were true, that physics has given up on the prob- 
lem of trying to predict exactly what will happen in 
a given circumstance. Yes! Physics has given up. 
We do not know how to predict what would hap- 
pen in a given circumstance, and we believe now 
that it is impossible — that the only thing that can 
be predicted is the probability of different events. It 
must be recognized that this is a retrenchment in 
our earlier ideal of understanding nature. 

Richard Feynman, Lectures on Physics, Part III 


This quote characterizes the dramatic change of perspective
on our capability to ‘understand’ the fundamental properties
of nature. It was in fact the German physicist Max
Born who forcefully argued for this probabilistic interpretation
of quantum mechanics, and he received the Nobel
prize in 1954 for this work. This interpretation is usually referred
to as the Kopenhagener Deutung, or Copenhagen
interpretation, of quantum mechanics.


Classical versus quantum measurements. Measure- 
ment in classical physics is conceptually rather trivial: One 
simply observes the classical state variables with a finite 
precision and thereby approximates the variable as a real 
number with a finite number of digits. The accuracy of 
measurements is limited only by background noise and 
the precision of the measuring instrument. The crucial
assumption is that one can make any such measurement 
without changing the state of the system. This implies that 
the order in which one makes measurements is irrelevant, 
and therefore there is no restriction on which variables 
could be measured ‘simultaneously’.


In the quantum setup we describe a particle with a wave- 
function which may be spread out over all of space. The 
fact that the wavefunction is spread over all of space, how- 
ever, does not mean that the particle is at many places si- 
multaneously, or that we could observe it in different places 
at the same time. It does not even mean that the particle is 
actually in some definite place and that we only happen to 
just not know where it is. The particle state is a probability 
amplitude, referring not to the probability where the parti- 
cle actually is but to where it might be found upon making 
a position measurement. As we will see it basically doesn’t 
make sense to talk about where the particle is before we 
observe it. In general the wavefunction tells us that the 
particle is, rather than where it is. 


Indeed, that situation is quite different from the proposition 
that we know someone is in a room behind a closed door, 
and we do not know where in the room this person exactly 
is, because in that case we know for sure that the person 
will be definitely somewhere and we may assign a certain 
probability distribution as to where she is. That distribution 
however reflects our ignorance, our not-knowing the exact 
state. It describes our lack of knowledge as observer, not 
the actual state this person is in. 


In quantum theory a given extended wavefunction speci- 
fies the state of the particle completely, and knowledge of 
that state does not allow us to deduce where the particle 
is; its position is just not determined, in that state it has 
no position a priori and it therefore makes no (quantum) 
sense to talk about it! The fundamental difference between 
a possible classical probability which reflects our lack of 
knowledge about the system, and the inescapable 


Leaving a trace. A misleading aspect 
of measurement theory is that the term 
measurement suggests that it is nec- 
essary to have an experimenter who is 
handling some intricate device to collect data. This 
is not the case. As a matter of principle, it only 
matters that the system interacted with something, 
somewhere, at some time, and that that interac- 
tion affected the state of the system. The interac- 
tion may have left a trace somewhere, an indelible 
mark, without any experimenter caring about it or 
even being aware of it. In that sense the notion of 
measurement is much more abstract, and less an- 
thropocentric than you might have thought. It is like 
‘forensic science, where one is searching for traces 
of past interactions call it of ‘measurements’ — that 
took place a long time ago: finger prints, car keys, 
or sunglasses left on a table, or phone calls, and 
photographs left on a remote server. A measure- 
ment is anything that leaves some discernible trace 
somewhere, at some instant in time. 

So if | engage into an interaction with a particle, its 
behavior may have been influenced by previous in- 
teractions | have no knowledge about, and that may 
in turn lead to unexpected outcomes in my experi- 
ment. Something | better be aware of. It is the hid- 
den constraints that often present an invisible yet 
fatal flaw. We return to these questions in Chap- 
terll.4. 0 


uncertainty that occurs even if we know the state exactly is 
that the quantum probability refers to an intrinsic property 
of the system and not to the state of knowledge that an 
observer like you or me might or might not have about that 
system. Yet, at the same time, the state limits fundamen- 
tally what an observer could possibly get to know about 
the system. As a consequence the measurement process 
in quantum mechanics is not at all trivial. 
Another notable difference with classical mechanics is that
in many instances the set of observable states is discrete, 
with quantized values for the physical variable. It is this 
property that has given the theory of quantum mechanics 
its name. 


Maybe the most profound difference is that quantum mea- 
surement typically causes a radical alteration of the state 
vector. Before the measurement of an observable we can 
only describe the possible outcomes in terms of proba- 
bilities, whereas after the measurement the outcome is 
known with certainty, and the wavefunction is irrevocably 
altered to reflect this. In the Copenhagen interpretation of 
quantum mechanics the wavefunction is said to ‘collapse’ 
when a measurement is made. 


In spite of the fact that quantum mechanics makes spec- 
tacularly successful predictions, the fact that quantum mea- 
surements are inherently probabilistic and can ‘instantly’ 
alter the state of the system in such a disruptive man- 
ner has caused a great deal of confusion and controversy. 
In fact, one can argue that historically the field of quan- 
tum computation emerged from thinking carefully about 
the measurement problem. 


No cloning! 


If measuring a quantum state changes it, you may wonder 
whether it is not a smart idea to copy such a state, be- 
fore making the measurement. Take one and make two 
identical ones out of it by using a quantum Xerox ma- 
chine. The answer is simply that this just cannot be done. 
Quantum copying is a no-go! This exceptional feature
creates the possibility of a novel type of ‘quantum security’:
information that cannot be copied without destroying
it. This makes the no-cloning principle a blessing in disguise.


What I am trying to tell you is that reading a quantum book
will change it in unpredictable ways. You might actually
want to avoid trouble with the librarian by copying the quantum
book before reading it. But even this precautionary
measure is obstructed by the quantum no-cloning theorem,
which was first formulated by William Wootters and Wojciech
Zurek and by Dennis Dieks in 1982.


Suppose I have one particle in a particular state, and I
want to bring another particle into exactly the same state.
Then I have to look at the state of particle one in order to
know what state to bring particle two into. But, by doing so,
I have to affect the state of particle one. The best I can
do in general is to bring particle two into the state particle
one was in before, but then particle one is no longer in
that state. This remarkable property can be shown to hold
rigorously: quantum states cannot be copied, but they may
be transferred from one system to another. And thinking
in terms of securing information and beating our National
Security Agencies with respect to protecting our privacy,
this no-cloning may turn out to be a blessing in disguise.
And it is.


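More precisely, the no-cloning theorem amounts to the
statement that for an arbitrary state $|\psi_1\rangle$ on one qubit and
some particular state $|\phi\rangle$ on another, there is no quantum
device [A] that transforms $|\phi\rangle \otimes |\psi_1\rangle \to |\psi_1\rangle \otimes |\psi_1\rangle$, i.e.
that transforms $|\phi\rangle$ into $|\psi_1\rangle$, while leaving the old $|\psi_1\rangle$ unaffected.
If $U_A$ is the unitary operator representing A, this
can be rewritten $|\psi_1\rangle|\psi_1\rangle = U_A|\phi\rangle|\psi_1\rangle$. For a true cloning
device this property has to hold for any other state $|\psi_2\rangle$ as
well, and we must also have $|\psi_2\rangle|\psi_2\rangle = U_A|\phi\rangle|\psi_2\rangle$. It is
not hard to demonstrate that the existence of such a device
leads to a contradiction. Since $\langle\phi|\phi\rangle = 1$ and $U_A^\dagger U_A = 1$,
the existence of a device that can clone both $\psi_1$ and $\psi_2$
would imply that


$$
\begin{aligned}
\langle\psi_1|\psi_2\rangle &= (\langle\psi_1|\langle\phi|)\,(|\phi\rangle|\psi_2\rangle) \\
&= (\langle\psi_1|\langle\phi|\,U_A^\dagger)\,(U_A|\phi\rangle|\psi_2\rangle) \\
&= (\langle\psi_1|\langle\psi_1|)\,(|\psi_2\rangle|\psi_2\rangle) = \langle\psi_1|\psi_2\rangle^2 .
\end{aligned}
$$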
The property $\langle\psi_1|\psi_2\rangle = \langle\psi_1|\psi_2\rangle^2$ only holds if $\psi_1$ and
$\psi_2$ are either orthogonal or aligned, meaning that either
$\langle\psi_1|\psi_2\rangle = 0$ or 1. It does not hold for arbitrary values of $\psi_1$
and $\psi_2$, so there can be no such general purpose cloning
device. In fact, in view of the uncertainty of quantum measurements,
the no-cloning theorem does not come as a
surprise. If it were possible to clone wavefunctions, it would 
be possible to circumvent the uncertainty of quantum mea- 
surements by making a large number of copies of a wave- 
function, measuring different properties of each copy, and 
reconstructing the exact state of the original wavefunc- 
tion. 
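
The argument can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the "copier" tried here is the standard CNOT gate with a blank qubit prepared in $|0\rangle$, which is one particular candidate device and not something taken from the text, and the helper names (`kron`, `cnot`, `clone_fidelity`) are invented for this example. The original qubit is listed first and the blank one second. The gate copies the two orthogonal basis states perfectly but fails on a superposition, as the overlap condition requires.

```python
import math

# Single-qubit states are amplitude pairs; two-qubit states are 4-vectors
# in the basis |00>, |01>, |10>, |11> (original qubit first, blank second).

def kron(a, b):
    """Tensor product of two single-qubit states."""
    return [x * y for x in a for y in b]

def cnot(state):
    """CNOT with the first qubit as control: swaps |10> and |11>."""
    s00, s01, s10, s11 = state
    return [s00, s01, s11, s10]

def overlap(a, b):
    """Inner product <a|b> of two amplitude lists."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

zero = [1.0, 0.0]
one = [0.0, 1.0]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

def clone_fidelity(psi):
    """|<psi,psi| CNOT |psi,0>|^2; a value of 1 means a perfect copy."""
    attempted = cnot(kron(psi, zero))
    ideal = kron(psi, psi)
    return abs(overlap(ideal, attempted)) ** 2

for name, psi in (("|0>", zero), ("|1>", one), ("|+>", plus)):
    print(name, round(clone_fidelity(psi), 6))

# The consistency condition <psi1|psi2> = <psi1|psi2>^2 fails for |0>, |+>.
s = overlap(zero, plus)
print("condition violated:", abs(s - s ** 2) > 1e-9)
```

For the superposition the gate produces an entangled pair rather than two independent copies, which is why the fidelity drops below one.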


The probabilistic outcome of measurements 


In the formalism of quantum mechanics the possible mea- 
surement outcomes of an observable quantity A are given 
by the eigenvalues of the matrix A . For example, the three 
Pauli matrices, defined in equation (II.2.2), all have the 
same two eigenvalues $\lambda_\pm = \pm 1$. This means that the
possible outcomes of a measurement of the spin in any
direction can only be plus or minus one. This is fundamentally 
different from a spinning object in classical physics, which 
can spin at any possible rate in any direction. The ob- 
served value of any component of a classical spin in this 
picture could be any real number between $-1$ and $+1$. 
This confirms that quantum mechanics is counter-intuitive 
and subtle indeed. 


If a quantum system is in an eigenstate of an observable, 
then the outcome of measurements of that observable is 
100% certain. For example, imagine we have a qubit in 
the state with $\alpha = 1$ and $\beta = 0$, so that
$|\psi\rangle = |{+1}\rangle$. It is then in the eigenstate of $Z$ with
eigenvalue $z = +1$ and the measurement of $Z$ will always
yield that value. This is depicted in Figure II.2.9(a), and is
reflected in the mathematical machinery of quantum
mechanics by the fact that for the spin or polarization
operator in the z-direction, $A = Z$, the eigenvector with
eigenvalue $\lambda_+ = +1$ is $|{+1}\rangle$ and the eigenvector with
$\lambda_- = -1$ is $|{-1}\rangle$. In contrast, if we make
measurements in another direction, e.g. $A = X$, the
outcomes become probabilistic. The outcome is still $+1$ or
$-1$, but there are calculable probabilities for each value to 
occur. So the take-away message here is that it is not the 
values of possible outcomes that change, only the proba- 
bility by which they will occur. Quantum theory is dealing 
with ‘certain uncertainties’, so to say. This is depicted in 
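Figure II.2.9(d). The eigenvectors of $X$ are: 

$$
|+\rangle = \tfrac{1}{\sqrt{2}}\,\bigl(|{+1}\rangle + |{-1}\rangle\bigr)
\quad\text{and}\quad
|-\rangle = \tfrac{1}{\sqrt{2}}\,\bigl(|{+1}\rangle - |{-1}\rangle\bigr) .
$$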


In general the probability of finding the system in a given 
state through a measurement is computed by first writing 
the given state $|\psi\rangle$ as a linear combination of the
eigenstates $|a_k\rangle$ of the matrix $A$ corresponding to the
observable, i.e. 

$$
|\psi\rangle = \sum_k \beta_k\,|a_k\rangle
\quad\text{with}\quad \beta_k = \langle a_k|\psi\rangle .
$$

The notation $\langle a_k|\psi\rangle$ means that the component $\beta_k$
is indeed equal to the projection of the state vector $|\psi\rangle$
on the eigenvector $|a_k\rangle$. The probability of measuring the
system in the state corresponding to eigenvalue $a_k$ is then
given by 

$$
p_k = |\beta_k|^2 = |\langle a_k|\psi\rangle|^2 . \qquad \text{(II.2.17)}
$$

As we discussed briefly before, this is why the coefficients 
$\beta_k$ in the expansion of the state $|\psi\rangle$ in a set of
eigenstates of some observable are called probability
amplitudes: amplitudes, because it is only after taking their
squared modulus that one obtains the probabilities for a
certain measurement outcome. And the normalization
condition on the state vector is just the statement that the
total probability to find the system in one of the allowed
states equals one. The other two pictures of Figure II.2.9
give similar distributions for an incoming $|+\rangle$ state. In
Figure II.2.10 we give the corresponding distributions of
electrons hitting the screen perpendicular to the beam. This
is what one sees when preparing the beam in the incoming
state and then measuring its polarization along some given
axis. 
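
The recipe above (expand the state in eigenvectors, then take squared moduli) can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the text: the function name `outcome_probabilities` and the dictionaries `Z_BASIS` and `X_BASIS` are names chosen here, and states are written as amplitude pairs in the $Z$ basis.

```python
import math

def inner(a, b):
    """Inner product <a|b> of two amplitude lists."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def outcome_probabilities(psi, eigenbasis):
    """Map each eigenvalue a_k to p_k = |<a_k|psi>|^2."""
    return {value: abs(inner(vec, psi)) ** 2
            for value, vec in eigenbasis.items()}

R = 1 / math.sqrt(2)
Z_BASIS = {+1: [1.0, 0.0], -1: [0.0, 1.0]}   # eigenvectors of Z
X_BASIS = {+1: [R, R], -1: [R, -R]}          # eigenvectors of X

up = [1.0, 0.0]    # |+1>, an eigenstate of Z
plus = [R, R]      # |+>, an eigenstate of X

for name, psi in (("|+1>", up), ("|+>", plus)):
    print(name, "measured along z:", outcome_probabilities(psi, Z_BASIS))
    print(name, "measured along x:", outcome_probabilities(psi, X_BASIS))
```

The four printed distributions correspond to the four panels of Figure II.2.9: certainty when the state is an eigenstate of the measured observable, and an even split otherwise.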


So, what constitutes a measurement? I have been somewhat
cavalier in talking about the notion of a measurement, 
while showing you nice and clean figures of some idealized 
experiments. Indeed at this stage, where we for example 
talk about spin polarization measurements, we have a sit- 
uation in mind where we distinguish three stages in a mea- 
surement experiment. 


(i) A preparatory stage, where we prepare the particle(s) 
so that the spin is in the desired state. For example we 
have electrons coming in and by using a Stern-Gerlach 
device (this will be explained in the next chapter) we can 
split the beam into two with opposite polarizations along an 
axis one may choose. This way one may prepare a beam 
of spins in some definite and identical polarization state up 
to an overall phase. 

(ii) A first stage of the measurement, where we let the pre- 
pared beam sequentially interact with some other devices, 
which make up the experiment. 

(iii) The second and final stage of the measurement, where 
we actually have a ‘screen’ or other counting device. So, 
in the end we measure a probability distribution that can 
be compared with a theoretical prediction, and potentially 
falsify our theory. 


The purist may say that only the very last stage constitutes 
the measurement, that is, the stage where the distribution
over the sample spaces of some pre-chosen set of
observables is obtained by projecting the outgoing particle
states. 


The projection postulate 


In classical physics, science started from the be- 
lief — or should one say, from the illusion? — that 
we could describe the world, or at least parts of the 
world, without any reference to ourselves. 

Werner Heisenberg 


Apart from the probabilistic nature of measurement out- 
comes, a second remarkable aspect of quantum measure- 
ment is the fact that the act of making a measurement will 
generically change the state of the system. It is disruptive 
and will cause what is known as a ‘collapse of the wave- 
function.’ The mechanism is also known as the projection 
postulate, which was formulated by John von Neumann in 
the early days of quantum mechanics. This postulate is at 
this point an extra and in fact ad hoc postulate. Ad hoc, 
because the measurement process itself is just a quan- 
tum process and therefore should be completely described 
within the framework of the theory. The outcome should be 
‘calculable’ from first principles and cannot be decreed by 
an additional postulate. In the end it is to be decided by 
ever more precise measurements whether or to what ex- 
tent the postulate really holds and correctly represents all 
possible choices. But even then, the postulate including its 
range of validity should be ‘proven’ from first principles. 
This being said, the reason this is so hard is that a typical
realistic measurement device is a macroscopic, classical
machine. Carrying out what I just said would be extremely
complicated, because you have to model the effective
interaction between quantum and classical degrees of
freedom, basically by going all the way down to the quantum
level in describing the apparatus. 

In an operational sense the projection postulate has so far
been confirmed by basically all experiments dedicated to
testing it. It is this ‘success’ that makes the terminology
and the related picture of the measurement process persist
in the mindset of most quantum practitioners. 


Over the last few decades, physicists have come to
distinguish so-called strong and weak measurements. Let us
comment on them in turn. 


Strong measurements. The strong measurements are 
the most common ones. One observes a particular
eigenvalue as we discussed, and the system then makes a
transition to exactly the corresponding eigenstate. This type
of measurement confirms the postulate by definition. 

(a) Measurement of the spin polarization along the z-axis, of the state $|\psi\rangle = |1\rangle$. Outcome: $p_z(+1) = 1$ and $p_z(-1) = 0$.
(b) Measurement of the spin polarization along the z-axis, of the state $|\psi\rangle = |+\rangle$. Outcome: $p_z(+1) = p_z(-1) = 1/2$.
(c) Measurement of the spin polarization along the x-axis, of the state $|\psi\rangle = |+\rangle$. Outcome: $p_x(+1) = 1$ and $p_x(-1) = 0$.
(d) Measurement of the spin polarization along the x-axis, of the state $|\psi\rangle = |1\rangle$. Outcome: $p_x(+1) = p_x(-1) = 1/2$.

Figure II.2.9: Spin polarizations. Graphical representation of spin polarization along different axes. The projections of the red state 
vector $|\psi\rangle$ along the axes of the measurement frames give the probability amplitudes for the outcome to be plus or minus one. 

Figure II.2.10: Spin polarization measurements. We have visualized the probability distributions discussed in the previous figure, in 
counts on a z-x screen. The panels (a) to (d) show the same four measurements as in Figure II.2.9. The incoming beam is coming down along the y-axis after passing through a polarizing beamsplitter. The 
width of the distribution is supposed to reflect the width of the beams. 

Figure II.2.11: Projective measurement. For the incoming state 
$|\psi\rangle$ there is a probability $p_k$, equal to the projection of the state 
on the eigenvector squared, to observe the (eigen)value $a_k$. 


What happens is depicted schematically in Figure II.2.11. 
We start with a system in some state $|\psi\rangle$. If we make 
a measurement of the observable $A$ and find a value $a_k$, 
then the act of making the measurement changes the state 
$|\psi\rangle$ to the state $|a_k\rangle$, the eigenstate of $A$ with observed 
eigenvalue $a_k$. What this means is that if we were to
measure $A$ again immediately after, we would find that same 
eigenvalue with 100% probability, and that seems like a 
reasonable thing to expect. 
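
A toy simulation may help to see what the postulate asserts operationally. It is a sketch under simple assumptions: an ideal strong measurement, states written as amplitude pairs in the $Z$ basis, and a hypothetical helper `measure` (not from the text) that draws an outcome with the Born probabilities and returns the corresponding eigenvector as the new state.

```python
import math
import random

def inner(a, b):
    """Inner product <a|b> of two amplitude lists."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def measure(psi, eigenbasis, rng):
    """Ideal strong measurement: returns (outcome, post-measurement state)."""
    values = list(eigenbasis)
    weights = [abs(inner(eigenbasis[v], psi)) ** 2 for v in values]
    outcome = rng.choices(values, weights=weights, k=1)[0]
    # Projection postulate: the state becomes the observed eigenvector.
    return outcome, list(eigenbasis[outcome])

R = 1 / math.sqrt(2)
X_BASIS = {+1: [R, R], -1: [R, -R]}

rng = random.Random(2024)          # seeded only for reproducibility
psi = [1.0, 0.0]                   # |+1>: an x-measurement is a coin flip
first, collapsed = measure(psi, X_BASIS, rng)
repeats = [measure(collapsed, X_BASIS, rng)[0] for _ in range(20)]

print("first outcome:", first)
print("repeated outcomes:", set(repeats))
```

Whatever the first outcome happens to be, every immediate repetition of the same measurement returns it again, which is exactly the behaviour the postulate is designed to capture.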


Weak measurements. Fortunately, one is of course free 
to invent whatever smart measurement schemes one wants 
to pursue, in order to extract, in a more subtle way, more 
information than the projection postulate would allow. This
has led to an interesting debate within the physics
community about so-called weak measurements and 
weak values. 


The idea is to make measurements where the interaction 
with the system is sufficiently weak that it hardly affects
the incoming state. Yet, there is the possibility to observe
a ‘weak value’ which would tell us ‘something extra’ 
about the state of the system. As the state has hardly
changed after the weak measurement, a strong measurement
of another incompatible observable, made right after the
weak one, would hardly be affected. You should think of
this as the subtle changes in the screen patterns of
Figure II.2.12, like a small displacement in one of the peaks. 


We have seen that a projective measurement with its
collapse of the wavefunction amounts to a major disruption of 
the system, and here we consider the possibility to perturb 
the system in a subtle way, meaning weakly. These weak 
measurements may tell us something about the state of 
the system without really making a complete projection. 
In Figure II.2.12 we have depicted a scheme proposed by 
Aharonov, Albert and Vaidman, and show what happens 
to the particle distributions after we do such a weak
measurement. We have incoming particles in a state
$|\psi\rangle = (|+\rangle + \sqrt{2}\,|-\rangle)/\sqrt{3}$. In Figure II.2.12(a) we
have the incoming beam and do no polarization
measurement. In Figure II.2.12(b) we measure the polarization
along the x-axis, and we see the expected splitting, with
outcome $p_x(+1) = 1/3$ and $p_x(-1) = 2/3$. In Figure II.2.12(c) we 
start with an incomplete polarization measurement along 
the z-direction, which means that we apply a weak field 
so that the beam does not really split. This amounts to 
a small perturbation of the incoming beam. However, if 
directly after the weak measurement we measure the
x-polarization of the perturbed beam, we observe a small 
displacement of the smaller peak in the z direction, as
indicated in Figure II.2.12(d). The projection along the x-axis
takes place as usual, but one has succeeded 
in getting some extra information on the ‘incompatible’
z-polarization. It is this tiny shift in the z direction which 
amounts to the measurement of a weak value. 
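
The strong-measurement statistics quoted for this incoming state are easy to check numerically. The sketch below does only that: it does not model the weak interaction or the shift of the peak, and the variable names are chosen here for illustration. States are written as real amplitude pairs in the $Z$ basis.

```python
import math

R = 1 / math.sqrt(2)
# Z-basis components of the X eigenstates |+> and |->.
plus = (R, R)
minus = (R, -R)

# |psi> = (|+> + sqrt(2)|->)/sqrt(3), expanded in the Z basis.
a, b = 1 / math.sqrt(3), math.sqrt(2 / 3)
psi = tuple(a * u + b * v for u, v in zip(plus, minus))

def prob(vec, state):
    """Born probability |<vec|state>|^2 (all amplitudes are real here)."""
    return sum(x * y for x, y in zip(vec, state)) ** 2

p_x = {+1: prob(plus, psi), -1: prob(minus, psi)}
p_z = {+1: prob((1.0, 0.0), psi), -1: prob((0.0, 1.0), psi)}

print("x-axis:", p_x)   # close to 1/3 and 2/3
print("z-axis:", p_z)   # close to 0.97 and 0.03
```

A full z-measurement on this state would therefore be far from an even split, with probabilities $1/2 \pm \sqrt{2}/3$; the weak scheme probes that z-information without performing the projection.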


So here we have an example that illustrates the subtlety 
of the notion of measurement, the clue being that we have 
concocted a setup where we go beyond a simple projective 
measurement. It underscores that all interactions could in
some way be called a measurement, if you are willing to 
stretch the semantics of the term measurement. 

(a) Measurement of the spin polarization of the state $|\psi\rangle = (|+\rangle + \sqrt{2}\,|-\rangle)/\sqrt{3}$. Polarizers are turned off.
(b) Measurement of the spin polarization along the x-axis, of the state. Outcome: $p_x(+1) = 1/3$, $p_x(-1) = 2/3$.
(c) A weak measurement of the spin polarization along the z-axis, of the same state, yields a perturbed state.
(d) Measurement of the spin polarization along the x-axis, of the perturbed state. Outcome: $p_x(+1) = 1/3$ and $p_x(-1) = 2/3$, as in (b). However, the small peak is slightly shifted.

Figure II.2.12: A weak spin measurement. The incoming beam is coming down along the y-axis after passing through a z- and/or 
x-polarizing beamsplitter. The width of the distributions reflects the width of the beams. The results are explained in the text. 

Figure II.2.13: Logic and syntax. In search of semantics? 


Quantum grammar: Logic and Syntax 


In the classical situation we speak of the phase space 
of a system, to be contrasted with the Hilbert space for 
quantum systems. The fundamentally different structure of 
these two spaces has profound consequences for the log- 
ical and deductive structure of these theories. Whereas in 
the classical case properties of the system generally can 
be associated with subspaces of the total phase space, 
one has on the quantum level to distinguish the space 
of observables from the Hilbert space, and choose from 
possible consistent frameworks which are more restrictive. 
Within a framework certain properties can be unambigu- 
ously assigned, and deductive logic can be applied. This 
is illustrated for the cases of a qubit and a particle. 


Compatible observables allow for joint eigenstates and thus 
for those states one may assign a point in the joint sample 
space. A maximal subset of independent observables that 
are mutually compatible defines a consistent framework $F$ 
to describe the system with. With the framework comes a 
sampling space $S$ which is a kind of quantum equivalent 
of the classical phase space. For the qubit example this 
is clear. A consistent framework could correspond to the $Z$ 
observable, and we may describe all states of the qubit as 
(normalized) linear combinations of the basis states $|{\pm 1}\rangle$, 
which are the eigenvectors of the $Z$ observable that makes 
up the framework. 


The framework for a quantum system is not unique, and 
the choice of framework depends on what question one 
wants to address and what aspect of the system one wants 
to study. If you make position measurements you use the 
Z-framework, and if you make momentum measurements 
you choose the X-framework. Let me emphasize however 
that a quantessence here is that there are observables 
which are not compatible with the framework. Logically 
speaking what this implies is that the observables incom- 
patible with the particular framework you are using can- 
not be assigned a meaning. They are meaningless in that 
framework because there is no logical way one can decide 
whether a property referring to the values of incompatible
observables is true or not. Hence quantum theory 
has well-defined observables that have the unusual
feature that they cannot be part of a logically sound deductive 
argument within a given framework. Let us take a closer 
look at this statement and find out what this means for a 
classical particle and its quantum descendant. 

Collapse of the wavefunction. In figure (a) below we give
a graphic impression of what is called ‘the collapse of 
the wavefunction.’ If you think of the 
wavefunction as a probability amplitude, it actually makes 
a lot of sense, because you would expect 
that repeating the same measurement immediately 
after you have made the observation $a_k$ would give 
exactly the same outcome with 100% certainty. But 
that can only be the case if the state has changed 
to the corresponding eigenstate $|a_k\rangle$ as decreed by 
the projection postulate. The term ‘collapse of 
the wavefunction’ suggests that there is a violent 
physical action at a distance going on if we make 
a measurement, but that is totally misleading. The 
wavefunction, which indeed encodes all there is to 
know about the state of the system, represents a 
probability amplitude, and making a measurement 
can drastically change the probability of future
measurement outcomes. 
This is a familiar phenomenon. If I know that you 
are somewhere in town, I may have a rather uniform
probability distribution for where you are that 
stretches all the way to the outskirts of the city. If 
you then suddenly happen to walk into my office, my 
probability distribution will indeed instantaneously 
collapse to some narrow spike that peaks right in 
front of my desk. But that doesn’t mean that something
is physically changing on the outskirts of town, 
nor will you be affected. 
The quantessential difference between the quantum
case and you is of course that the distribution 
I had in my mind about you was certainly not all 
there was to know about the system called ‘you’. It 
had more to say about my state of ignorance than 
about you. The measurement did not affect you nor 
places where you could have been. Apparently in 
quantum theory the strict separation of object and 
subject that reigns in classical physics is no longer 
valid: no longer any neutral observers, no peeking, 
or looking without touching. 

In the classical context, the separation of object 
and subject is based on the assumption that it 
is in principle possible to make the effect of the 
measurement on the system arbitrarily small. This 
is no longer true in quantum theory. 


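[Figure panel labels: incoming state $|\mathrm{in}\rangle = |\psi\rangle$ with wavefunction $\psi_{\mathrm{in}}(x) = \langle x|\mathrm{in}\rangle$; position measurement; outgoing wavefunction $\psi_{\mathrm{out}}(x) = \langle x|\mathrm{out}\rangle$.]

(a): Collapse of the wavefunction. A state $|\psi\rangle$ comes in 
and a measurement of the observable $A$ is made. This 
yields with a probability $p_n$ the outcome $x_n \in \mathcal{X}$, and the 
state $|\psi\rangle$ instantly ‘collapses’ to the state $|x_n\rangle$. 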


Sure enough, given a particular state there may be 
an appropriately chosen measurement that does 
not change the state, but in general it does change 
the state. So imagine how strange it would be if, af- 
ter you read that quantum book, it changed. Never 
a dull moment, but alas nobody could guarantee 
you that the book would still make sense after you 
read it. A recipe for great applications in social 
media, I think. 

(b) Classically the state of a particle in one dimension is defined by its position $x$ and momentum $p$, which define a point in its phase space.
(c) The region corresponding to the proposition $A$: $x_0 < x < x_1$ is shaded blue. It is true for a state if the point representing that state is in the blue region.
(d) The proposition $B$: $0 < p < p_1$ is true for all points in the dark red shaded region.
(e) The conjunction ‘and’, denoted as $A \wedge B$, corresponds to the bright red region.
(f) The disjunction ‘or’, denoted as $A \vee B$, corresponds to the green region.

Figure II.2.14: Propositions in classical physics. Propositions about the position $x$ and momentum $p$ of a particle in one dimension 
and their conjunctions. 

The case of a classical particle 


Position and momentum are the basic observables that la- 
bel the dynamical state of a particle which corresponds to 
a point in the phase space of the particle as illustrated in 
Figure II.2.14(b). These are basic because in the Newto- 
nian ‘framework’ one has to specify the momentum and 
position at some initial time. Then the states at any other 
time would be determined provided we know the force act- 
ing on the particle. The fact that the momentum and posi- 
tion variables are basic also implies that other dynamical 
variables like energy can be expressed in them. 


We can make propositions involving properties of particu- 
lar states of the particle and find a yes/no answer to whether 
that proposition is true or false. Not only can we answer 
questions about the elementary properties but also about 
conjunctions of those. For example, we may ask whether 
a state has the property $A$: $x_0 < x < x_1$. Then for all 
points $(x, p)$ in phase space in the blue shaded region of 
Figure II.2.14(c) the answer is yes, and outside that region 
it would be no. 


So we can assign a truth value ‘1’ or ‘0’ to the proposition 
$A$ accordingly. Similarly we may ask for the $p$ value to 
satisfy $0 < p < p_1$ and define it as proposition $B$, and 
then we get the picture of Figure II.2.14(d). Now we can 
ask for combined properties of $x$ and $p$. For example, we 
may ask whether the property $A \wedge B$ ($A$ and $B$) is true or 
not. The truth value of this conjunction can be calculated, 
and for the case at hand it equals the product $AB$ of the
truth values of $A$ and $B$. This assignment requires of course
that $AB = BA$. The conjunction is true when the point is
located in the bright red shaded rectangle indicated in Figure 
II.2.14(e), the region that is the intersection of the shaded 
regions in the two previous figures. 


Similarly, one may ask whether $x_0 < x < x_1$ or $0 < p < p_1$ 
is valid, which means that we ask whether the property 
$A \vee B$ is true or not. This proposition is represented in
the picture as the union of the shaded areas, which is the 
green shaded area in Figure II.2.14(f). Formally the truth 
value can be calculated by the formula $A + B - AB$. The 
figures can be summarized in a conventional truth table, 
Table II.2.1, exactly as used in elementary 
(propositional) logic. So to find the properties of the
classical particle, physicists infer these from the rules of 
a simple deductive logical scheme that is mathematically 
represented by a Boolean algebra with variables that can 
only take two values, zero (false) or one (true). 

Table II.2.1: Truth table for the propositions made in Figure II.2.14.

  $A$   $B$   $A \wedge B$   $A \vee B$
   1     1        1            1
   1     0        0            1
   0     1        0            1
   0     0        0            0
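
The Boolean arithmetic described here is simple enough to run. The following sketch uses the two formulas from the text, $AB$ for ‘and’ and $A + B - AB$ for ‘or’; the function names and the numerical window edges (`x0`, `x1`, `p1`) are placeholders chosen only for this illustration.

```python
from itertools import product

def conj(a, b):
    """Truth value of 'A and B', computed as the product AB."""
    return a * b

def disj(a, b):
    """Truth value of 'A or B', computed as A + B - AB."""
    return a + b - a * b

def truth_values(x, p, x0=0.0, x1=1.0, p1=1.0):
    """Truth values (A, B) for the phase-space point (x, p)."""
    return int(x0 < x < x1), int(0 < p < p1)

# Reproduce the four rows of the truth table.
print(" A  B  and  or")
for a, b in product((1, 0), repeat=2):
    print(f" {a}  {b}   {conj(a, b)}    {disj(a, b)}")

# A point inside the x window but outside the p window.
a, b = truth_values(0.5, 2.0)
print(a, b, conj(a, b), disj(a, b))
```

Because the classical truth values are ordinary numbers, the order of the factors in the product never matters; that is the property that fails for quantum projectors.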


The case of a quantum particle 


Let us now sketch what happens to the particle in the 
quantum arena. There is again a basic set of quantum 
observables ‘X’ and ‘P’. And again one may ask at any 
moment what the value of any of the observables is and 
verify by measurement whether the proposition is true or 
false. 


Sampling spaces. Here we first have to address the
question of what the sampling spaces for these observables 
are. Let us allow two possibilities for the space in which 
the particle moves: it could be infinite and correspond to a 
straight line or it could be finite, say, a circle. The possible 
outcomes of position measurements would of course
correspond to points in these spaces, meaning that the
sample space $S_x \simeq \mathcal{X}$. However, the sample space for the 
momentum observable turns out to depend on the
topology of the underlying configuration space. 

If the particle lives on the real line and $\mathcal{X} \simeq \mathbb{R}$, then the 
possible values for the momentum variable are continuous 
just like the position variable, $S_p \simeq \mathbb{R}$. 

On the other hand, if the configuration space is 
a circle, $\mathcal{X} \simeq S^1$, then, as Bohr told us, the spectrum of 
the momentum becomes discrete and would in fact
correspond to the set of integers, denoted by $S_p \simeq \mathbb{Z}$. We 
will treat the case of a particle on a circle in detail in 
Chapter II.5. We have indicated the two possibilities in
Figure II.2.15. 

Figure II.2.15: Sample space of momentum. The sample 
space $S_p$ for the momentum observable in the quantum case 
depends on the topology of the (continuous) configuration space 
$\mathcal{X}$ in which the particle moves. [Panel label: infinite space, $p$ continuous.]


There is a third possibility here, that at first may strike 
you as utterly pointless but turns out to be quantessential 
and should not be overlooked. Imagine that the position 
space itself is discrete and infinite, like a one-dimensional 
lattice $\mathbb{Z}$; then one should expect to find that the
sample space for the momentum becomes a circle, $S_p \simeq S^1$. 
The momentum in that case becomes an angular variable 
$0 \le \theta < 2\pi$. 

Going yet one step further, we can also ask what
happens if the position space is discrete and finite, for
example cyclic like the corners of a polygon. Then, interestingly 
enough, the sample space of the analogue of a
momentum observable associated with the particle hopping from 
one state to another would also become periodic and
discrete. We have already run into the simplest example of 
this, a space with two points being just a classical bit, or 
classical Ising spin, which as we saw on a quantum level 
gives rise to the qubit or quantum spin. For that case it 
turned out that the position observable $Z$ had two
eigenvalues $\pm 1$, and the same was true for the ‘momentum
operator’ $X$. As you see, we have managed to wrap a whole 
lot of quantessence in a qubit, and will continue to do so. 

Figure II.2.16: Sample space of position. The sample space 
for the position observable is the real line. We indicated two 
possible propositions $A$ and $B$, and their conjunctions. 
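
The finite cyclic case can be checked directly. The sketch below assumes a ring of `n_sites` points and takes the one-step hopping (shift) operator as the stand-in for the ‘momentum’ observable; that identification, and the helper names, are assumptions of this illustration. Plane waves are its eigenvectors, with eigenvalues $e^{-2\pi i k/N}$ that are discrete and periodic in $k$. For two sites the shift is just the matrix $X$, with eigenvalues $\pm 1$.

```python
import cmath
import math

def shift(v):
    """Hop every amplitude one site forward on the ring: (Tv)[n] = v[n-1]."""
    return [v[n - 1] for n in range(len(v))]

def plane_wave(k, n_sites):
    """Normalized plane wave with integer wave number k on the ring."""
    return [cmath.exp(2j * math.pi * k * n / n_sites) / math.sqrt(n_sites)
            for n in range(n_sites)]

def hop_eigenvalue(k, n_sites):
    """Eigenvalue of the shift on the plane wave k (checked explicitly)."""
    v = plane_wave(k, n_sites)
    tv = shift(v)
    ratio = tv[0] / v[0]
    # Verify that T v = ratio * v holds on every site.
    assert all(abs(tv[n] - ratio * v[n]) < 1e-12 for n in range(n_sites))
    return ratio

# Two sites: the hopping operator is the Pauli matrix X.
print([hop_eigenvalue(k, 2) for k in range(2)])

# Six sites: six distinct phases on the unit circle, and k = 7 repeats k = 1.
print([hop_eigenvalue(k, 6) for k in range(6)])
print(abs(hop_eigenvalue(7, 6) - hop_eigenvalue(1, 6)) < 1e-9)
```

As the number of sites grows, these phases fill the unit circle more and more densely, which is the lattice picture of a momentum that lives on $S^1$.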
This concludes our discussion of a first crucial difference 
between the classical and quantum sampling spaces of a 
‘particle’. 

Incompatible observables. The second difference is far 
more dramatic. It turns out that the position and
momentum observables are incompatible, which means that a 
consistent framework for the quantum particle can only 
be based on either the momentum observable or on the 
position observable. So, in going from classical phase 
space to the quantum space one can choose the
momentum sample space indicated in Figure II.2.15 or for
example the position space of Figure II.2.16, and we ‘lose’ the 
orthogonal dimension. The amputation of half the
number of dimensions is quite an operation and I can
imagine that you, following our discourse, may suffer from a 
kind of ‘phantom pain’-like experience. This loss implies 
a quantessential restriction on what can be considered ‘a 
meaningful statement’ about properties of the system, and 
at the same time creates ample room for void statements 
and ‘fake news.’ 


What we just said also means that the quantum extension 
of our deductive logic gets severely restrained. Clearly if 
we compare the possible properties of a classical particle 
illustrated in Figure II.2.14 to the possible properties of a 
quantum particle given in Figure II.2.16, these are radically 
different. Most importantly we cannot assign properties to 
the $P$ and $X$ observables simultaneously, and hence
cannot carry over the classical picture at all. What is left on 
the quantum level is that we may assign properties and 
ask for their conjunctions as long as they refer to one of 
the two observables, and this is illustrated in Figure II.2.16 
where we did define two propositions $A$ and $B$
pertaining to the position variable and their logical conjunctions 
$A \wedge B$ and $A \vee B$. In conclusion, we note once more that 
because quantum operators in general do not commute, 
axes prominently present in the classical picture may be 
completely absent on the quantum level. This does not 
mean that the ‘lost’ observable $X$ or $P$ has taken the value 
zero and we have left out the corresponding axes. No, 
it says that a variable which is not part of the framework 
has no meaning, let alone a value, and the axis is just not 
there! 

Footnote: In fact one may choose any linear combination of the two, but for 
the moment we choose this simple restriction. 


We will run into this kind of situation repeatedly, where 
before making any strong statements on the properties 
of a state of a quantum system, we have to be explicit 
about the framework we are using. In quantum theory 
we apparently have one complete, consistent and
rigorous mathematical formalism that supports many logically 
distinct frameworks. This may remind you of special
relativity where one also distinguishes many reference frames 
which are relativistically equivalent, as they can be
transformed into each other by a Lorentz transformation. But 
to make an argument you had better not mix up statements 
that hold in different frames. And here we are finding many 
frameworks which are quantum (or unitarily) equivalent, but 
when making a physical argument, you had better stick to
one if you want to keep your physics straight. 


This may at first sight look strange and unfamiliar and a 
heavy load of reader unfriendly jargon, but at the same 
time it is a precise, concise and explicit statement of what 
states, dynamical variables and measurements in quan- 
tum theory are about. And it is this core structure of the 
theory that we want to extensively explore in the remain- 
der of this volume. This exposition has hopefully made you 
feel more comfortable with it, because from the underlying 
mathematical structure lots of quantessential properties 
can be derived. These quantessential properties, which 
to the classical mind may appear exotic to say the least, 
are falsifiable at least in principle, and have turned
quantum physics into a full-fledged scientific theory. The
construction of this solid mathematical framework was largely 
the brilliant work of the second generation of outstanding 
quantum physicists, like Werner Heisenberg, Erwin
Schrödinger, Paul Dirac, Max Born and John von Neumann, to
mention a few. 

The case of a quantum bit 


Philosophers talk about an ontology in which the quantum 
reality could be understood and categorized. What are its 
basic entities, what are their measurable properties and 
what are the rules governing them? One likes to under- 
stand what the propositions or properties are that are ei- 
ther true or false. And as we have seen in quantum theory 
the rules about observables appear to be rather bizarre, 
and therefore it is illuminating to study their logical struc- 
ture in more detail. 


Projection operators. It is convenient to go back to some 
of the statements we made on page 292 of the previous 
section. Suppose we have some Hilbert space $\mathcal{H}$ and a 
suitable set of observables that mutually commute and
whose common eigenvectors $\{|i\rangle\}$ span $\mathcal{H}$. Or we 
could construct a single observable which would be
non-degenerate and therefore satisfy 

$$
A\,|i\rangle = a_i\,|i\rangle ,
$$

with all its eigenvalues $a_i$ being different. Then we could 
consider the elementary projectors: 

$$
P_i = |i\rangle\langle i| ,
$$

which satisfy: 

$$
\sum_i P_i = 1 ,
$$

and therefore we can introduce the logical negation
$\sim\! P_i = 1 - P_i = \sum_{j \neq i} P_j$, which is of course also
a projection operator, one that projects states on the subspace
orthogonal to $|i\rangle$. These projectors all commute; furthermore
the observable $A$ can in this basis simply be expressed as 

$$
A = \sum_i a_i\,P_i
$$

with the eigenvalues as coefficients. The Hamiltonian
operator for example can be written as: 

$$
H = \sum_n E_n\,|\psi_n\rangle\langle\psi_n| . \qquad \text{(II.2.18)}
$$

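Let us verify some of the equations above for the Pauli
matrices. The projection operators would correspond to
the matrices:

$$P_1 = |1\rangle\langle 1| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad P_{-1} = |{-1}\rangle\langle{-1}| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \qquad \text{(II.2.19)}$$

These operators commute and indeed $P_1 + P_{-1} = 1$. The
observable $Z$ can be expanded in the projection operators
as $Z = P_1 - P_{-1}$. Just for completeness we also give the
expressions related to the observable $X$:

$$P_\pm = |\pm\rangle\langle\pm| = \frac{1}{2}\begin{pmatrix} 1 & \pm 1 \\ \pm 1 & 1 \end{pmatrix},$$

and similar properties hold.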
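The properties claimed for the qubit projectors are easy to check numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `projector` and `is_projector` are illustrative and not part of the text.

```python
import numpy as np

# Pauli matrices Z and X, written in the basis of Z eigenstates {|1>, |-1>}.
Z = np.array([[1, 0], [0, -1]], dtype=float)
X = np.array([[0, 1], [1, 0]], dtype=float)
IDENTITY = np.eye(2)

def projector(vec):
    """Return |v><v| for the (normalized) vector v."""
    v = np.asarray(vec, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def is_projector(P):
    """A projector is hermitian and idempotent: P = P^dagger and P^2 = P."""
    return np.allclose(P, P.conj().T) and np.allclose(P @ P, P)

# Projectors on the Z eigenstates |1>, |-1> and on the X eigenstates |+>, |->.
P_1, P_m1 = projector([1, 0]), projector([0, 1])
P_plus, P_minus = projector([1, 1]), projector([1, -1])

checks = {
    "all four are projectors": all(
        is_projector(P) for P in (P_1, P_m1, P_plus, P_minus)
    ),
    "P_1 + P_-1 = 1": np.allclose(P_1 + P_m1, IDENTITY),
    "Z = P_1 - P_-1": np.allclose(P_1 - P_m1, Z),
    "P_1 and P_-1 commute": np.allclose(P_1 @ P_m1, P_m1 @ P_1),
    "P_+ + P_- = 1": np.allclose(P_plus + P_minus, IDENTITY),
    "X = P_+ - P_-": np.allclose(P_plus - P_minus, X),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Each check corresponds to one statement in the text; the last two are the "similar properties" for the observable $X$.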


With these projectors we may now associate properties or
propositions that may be true or false, in the sense that if we
measure $A$ and obtain some particular outcome $a_k$, we
stipulate that $P_k$ is 1 (true), and all other $P_i$ are 0 (false):

$$P_k|k\rangle = 1\,|k\rangle.$$

You may verify this outcome from the examples above.


Non-commuting projectors. So far so good, but what
happens if we want to define elementary logical combinations
of properties? Say we want to ask whether $P$ and $Q$
($P \wedge Q$) is true. From Table II.2.1 one learns that such a
proposition would correspond to the truth value of the
projector $PQ$ or $QP$. The logical proposition $P$ or $Q$ ($P \vee Q$)
has truth value $P + Q - PQ$, and also involves the product.
But now we run into a problem because the product of two
projectors is again a projector only if they commute. So in
quantum mechanics neither $PQ$ nor $QP$ can in general be
true or untrue, and this poses a fundamental problem from
an ontological point of view.

Consider in the qubit example above, for instance, the
proposition $P_1 \wedge P_+$. This would have to correspond to the
product operator

$$P_{1+} = P_1 P_+ = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad \text{or} \quad P_{+1} = P_+ P_1 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix},$$

but these are different and moreover neither of them is a
projection operator ($P_{1+}^2 \neq P_{1+}$) to which truth values could
be assigned. In the language used before we say that $Z$
and $X$ are indeed incompatible observables.
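That the product of two non-commuting projectors fails to be a projector can be verified directly. The sketch below assumes NumPy and uses illustrative variable names; it only checks the two defining properties (hermitian, idempotent) for both orderings of the product.

```python
import numpy as np

P_1 = np.array([[1, 0], [0, 0]], dtype=float)            # |1><1|, Z = +1
P_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=float)   # |+><+|, X = +1

def is_projector(P):
    """Hermitian and idempotent: P = P^dagger and P^2 = P."""
    return np.allclose(P, P.conj().T) and np.allclose(P @ P, P)

P_1p = P_1 @ P_plus   # first ordering of the product
P_p1 = P_plus @ P_1   # second ordering of the product

orders_differ = not np.allclose(P_1p, P_p1)
print("P_1 P_+ =\n", P_1p)
print("P_+ P_1 =\n", P_p1)
print("orderings differ:", orders_differ)
print("P_1 P_+ is a projector:", is_projector(P_1p))
print("P_+ P_1 is a projector:", is_projector(P_p1))

# Squaring the product halves it instead of reproducing it.
print("(P_1 P_+)^2 equals P_1 P_+ / 2:", np.allclose(P_1p @ P_1p, 0.5 * P_1p))
```

Because squaring the product does not return the product itself, no consistent truth value 0 or 1 can be attached to it.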


The choice of a framework. We can now avoid some 
of this by demanding that we only use a set of mutually 
commuting projectors or a set of compatible observables, 
linked to a given basis defined by some generic observ- 
able. Such a framework does indeed limit the number 
of properties that can be assigned to the system. But 
adopting such a framework one can use ordinary deduc- 
tive logic concerning the restricted set of properties of the 
system. 


And conversely a state can only have or not have a
property $a_i$ if we work in a framework where we can assign a
truth value to its associated projector $P_i$. So, other
non-commuting observables simply have no meaning in such a
framework. And we have to think of such states in terms of
a probability amplitude over the sample space connected
to the framework one happens to be working with. There
are many inequivalent such sets, and which one to choose
depends on what aspects of the theory one wants to study.
This observation suggests the use of the notion of
a single framework, as a set in which to describe quantum
states and also the propositions about the system which
are meaningful in that framework. This defines an
additional syntactic rule which forbids combining incompatible
frameworks in a single description of the properties of
the system. This is central to what is sometimes referred
to as the new quantum logic.


In this single framework setting of quantum mechanics we 
return as closely as possible to a classical description of 
states with definite properties and statistical distributions 


over sample space. Describing the dynamics in such a 
single framework makes the quantum time evolution into 
some quite ordinary stochastic process as we will point 
out later. 


Certain uncertainties 


Nothing [in quantum theory]... was more startling 
than Heisenberg’s uncertainty principle, which de- 
nied the possibility of simultaneously measuring cer- 
tain properties of motion. The uncertainty princi- 
ple introduced us to quantum fluctuations, reveal- 
ing empty space to be in fact a cauldron of activity. 
John Archibald Wheeler, 
Geons, Black Holes & Quantum Foam (1998) 


Early on in the development of quantum theory it was Wer- 
ner Heisenberg who proved his fundamental uncertainty 
relations stating the impossibility of simultaneously mea- 
suring certain variables that characterize the state with ar- 
bitrary precision. There is a fundamental limit to the accu- 
racy of quantum measurements set by Planck’s constant. 
These relations, more than anything else, express the pro- 
found difference between classical and quantum systems. 
We discuss the position-momentum uncertainty relation 
for a particle state, and work out the detailed example for 
a qubit. 


Momentum versus position. Accepting that the state is
completely specified by a wavefunction that will only tell
you the probability amplitude for finding certain outcomes
for any given observable, another question remains: what
does the wavefunction say about the momentum of the
particle? There is no mention of momentum; it doesn’t
seem to play any role whatsoever in the definition of the
state. This seems perfectly alright in view of what we have
been talking about in the previous section on compatible
observables and frameworks. All true, but could I not
perfectly well decide to go out and just measure it, couldn’t I?
Yes, you certainly can and you would indeed get a definite
answer. But the story is the same as with the position
measurement. Say you prepare a particle in a certain state
described by some wavefunction $\psi(x)$ and you measure
a value for the momentum $p = p_0$. Then you could repeat
the whole procedure, somehow again prepare the particle
in exactly the same initial state, and once more
measure its momentum. What would you find? Well, the
statement is that in general you would get another
outcome $p_1 \neq p_0$. How vague can a theory be? Well, in
a sense that’s precisely what quantum theory is about: it
tells you exactly how vague outcomes of measurements
are.

Figure II.2.17: Pointillism. Detail (bottom) of the pointillist
painting ‘A Sunday Afternoon on the Island of La Grande Jatte’ (top)
by the French painter Georges Seurat. Painted some years
before the moment when Planck made his groundbreaking quantum
hypothesis, this work showed how a closer look may reveal
a quantum structure. (Source: Wikimedia.)


Certain uncertainties. Probabilities imply uncertainties
in outcome, but the magnitude of those uncertainties is
precisely determined. We have to deal with ‘certain
uncertainties’ so to speak. In fact there are strong bounds on the
uncertainties of different observable quantities. You might
for example try to circumvent the quantum uncertainties by
being smart. Suppose you say: I measure the position of a particle
so that it is well localized in position space, and then
immediately after I measure the momentum so that I can also
localize the particle in momentum space. By doing this, am
I not arbitrarily close to the state in classical physics where
we could assign a precise position and momentum to a
particle at any instant? The stupefying answer is: certainly
not!


The Heisenberg uncertainty principle 


The quantessential message on the differences between
classical and quantum observables is very clearly,
concisely and quantitatively encoded in what are called the
Heisenberg uncertainty relations. For the case at hand Heisenberg
derived that for any state of a particle the following relation
holds for the uncertainties in position $\Delta x$ and momentum
$\Delta p$ of the particle in that state:

$$\Delta x\,\Delta p \geq \frac{\hbar}{2},$$

where the spread is just the width of the respective
probability distributions. It relates measurement outcomes for
the same state in different frameworks! What Heisenberg
proved was exactly that there is a lower bound on the
product of those widths. It shows unequivocally that the
situation, generally assumed in classical physics, where both
widths can be taken to zero in principle (assuming ideal
measurement apparatus etc.) is not possible in quantum
theory as a matter of principle.


If we drop a marble in a bowl, it will after some oscilla- 
tions settle down in the minimal energy state which means 
that it will be at rest at the bottom of the bowl. Momentum 
zero and position fixed exactly: no uncertainties. Clas- 
sically yes, but because of the uncertainty relations, or 
the particle-wave duality for that matter, this cannot be the 
quantum story. A quantum marble cannot settle down in a 
state where it is at rest at the bottom of the quantum bowl, 
because then its position and momentum would be exactly 
known, there would be no uncertainty, and that is not an 
allowed state. The lowest energy state of the quantum 
marble in a quantum bowl turns out to be one where the 
uncertainties in position and momentum are about equal 
and saturate the lower bound of the uncertainty relation. It 
gets as close to the classical ideal as possible you could 
say, but the truth is that the lowest energy state of the par- 
ticle does not specify where it exactly is nor what its mo- 
mentum precisely is. 
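The claim that the lowest energy state saturates the bound can be illustrated numerically. The sketch below rests on assumptions that are not spelled out in the text: the bowl is modelled as a harmonic potential, whose ground state is a Gaussian wavefunction; units are chosen such that $\hbar = 1$; and the function name `position_momentum_spreads` is purely illustrative. The spreads are computed on a grid, using $\langle p^2\rangle = \hbar^2 \int |\psi'(x)|^2\,dx$.

```python
import numpy as np

HBAR = 1.0  # natural units (an assumption made for convenience)

def position_momentum_spreads(psi, x):
    """Return (Delta x, Delta p) for a wavefunction sampled on a uniform grid."""
    dx = x[1] - x[0]
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)   # normalize
    prob = np.abs(psi) ** 2
    mean_x = np.sum(x * prob) * dx
    var_x = np.sum((x - mean_x) ** 2 * prob) * dx
    dpsi = np.gradient(psi, dx)                           # numerical psi'(x)
    mean_p = (np.sum(np.conj(psi) * (-1j * HBAR) * dpsi) * dx).real
    mean_p2 = HBAR ** 2 * np.sum(np.abs(dpsi) ** 2) * dx
    return np.sqrt(var_x), np.sqrt(mean_p2 - mean_p ** 2)

x = np.linspace(-20.0, 20.0, 8001)

# Gaussian ground state of width sigma = 1 (oscillator model of the bowl).
ground = np.exp(-x ** 2 / 4.0)
dx0, dp0 = position_momentum_spreads(ground, x)
ground_product = dx0 * dp0

# First excited oscillator state, for comparison.
excited = x * np.exp(-x ** 2 / 4.0)
dx1, dp1 = position_momentum_spreads(excited, x)
excited_product = dx1 * dp1

print(f"ground state : dx*dp = {ground_product:.4f}  (bound hbar/2 = {HBAR / 2})")
print(f"excited state: dx*dp = {excited_product:.4f}")
```

Under these assumptions the ground state sits on the bound $\hbar/2$ up to discretization error, while the excited state lies well above it.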


As we will see later, there exist Heisenberg uncertainty
relations between any pair of observables $A$ and $B$, but a
non-trivial (non-zero) bound occurs only for an incompatible
(non-commuting) pair. What does this have to do with
my exposé about frameworks? Surprisingly little in fact.
The uncertainty relations link the spreads in outcomes of
measurements of a pair of observables in any given state.
So given a state $|\psi\rangle$ of a particle, one can imagine making
many independent measurements of, say, the position $x$ of
the particle in that state. This of course does not mean
that you make a simple sequence of measurements on a
single particle, because a measurement will change, what
do I say, will collapse the state! So you have to prepare
‘identical’ particles in identical states and then make
repeated measurements of the observables in question. You
may start with position to obtain an average or expectation
value $\langle x\rangle$ and some spread $\Delta x$. Subsequently, one
could make independent momentum measurements
producing a distribution of outcomes with an average $\langle p\rangle$ and
spread $\Delta p$. Heisenberg’s fundamental relation says that
the product of these spreads or ‘uncertainties’ is larger
than or equal to $\hbar/2 = h/4\pi$. So we do not compare
individual measurement outcomes but distributions thereof.
In Figure II.2.18 we show that the product of uncertainties
in a given state corresponds to a certain rectangular area
in the (classical) phase space; the shape of the rectangle
depends on the state but its area has to be larger than the
minimal area indicated in the figure. The conclusion
therefore is that in the quantum world there can be no states
in which both position and momentum take on precise
values! It is a profound statement concerning probabilities of
measurement outcomes of different variables in any given
state, but that in itself has no bearing on the logical
structure of the quantum world we were discussing in the
previous section, though it is of course consistent with it.

Figure II.2.18: Heisenberg’s uncertainty relation. The
uncertainty relations for position and momentum,
$\Delta x\,\Delta p \geq \hbar/2$, define the minimal
area in classical phase space (axes $x$ and $p$) corresponding
to possible states with uncertainties $\Delta x$ and $\Delta p$.

Figure II.2.19: Time-frequency duality. Representation of a
sound signal as a periodic function of pressure (in red) in time:
$P(t) = \cos\omega t$, or as a function of frequency $P(\omega)$ with two
narrow peaks around $\omega/2\pi = \pm f_0$. A periodic signal is not
localized in time, with $\Delta t \to \infty$, but is very localized in frequency,
$\Delta f \to 0$.


Non-trivial uncertainty relations exist for all pairs of
incompatible or non-commuting observables, because these
cannot be measured simultaneously, or stated more precisely:
if the system has a definite value for the one variable, it
is not possible to assign a value for the other. One can
choose either one to quantify or describe any state of the
system but not both. We conclude that quantum states are
thus described by a maximal number of mutually
compatible observables that define a framework. And indeed not
all choices of sets of compatible observables are equally
convenient or practical; that depends on what you want to
know about the system.


Figure II.2.20: Time-frequency duality. The ‘clap in hands’
signal is very much localized in time, $\Delta t \to 0$, and spread out very
widely in the frequency domain, $\Delta f \to \infty$.


A sound analogy 


In this subsection we take one further step trying to under- 
stand what incompatible observables, and the uncertainty 
relations they obey, mean. Surprisingly enough, there are 
uncertainty relation look-alikes in the classical physics of 
waves that may take some of the mystery away. Let us 
for example think about sound. Sound is a pressure wave 
that passes. At some point in space we hear a sound sig- 
nal and ask how we would characterize it. One way is to 
plot the pressure variations in real time, and another way 
is to represent the signal in the frequency domain as a su- 
perposition of sounds of different frequencies with different 
amplitudes. These pictures would look quite different but 
contain the same information and are just different repre- 
sentations of the same signal. 


Let us first look at (or listen to) a pure tone like the ‘a’.
A truly pure ‘a’ of 440 hertz is represented in time by a
pure sine or cosine wave of a fixed wavelength which has
that single frequency of 440 Hz. But for a cosine to be
pure it has to last a very, very long time (compared to the
inverse frequency), as I indicated with the red curve in
Figure II.2.19. So, a pure tone is very much extended in the
time domain, but if you look in the frequency domain it is
extremely narrow because the signal has only a single
frequency (in fact $f = \pm f_0$) as you see in the narrow peaked
blue curve in the same figure. Now, in Figure II.2.20, the
opposite happens: when I clap my hands loudly once, or
shoot a gun, the signal is extremely short in the time
domain, but in the frequency domain it is very wide (see
the footnote). If I clap my hands or bang a hammer on the
table and I ask you what the pitch was of the sound you
heard, you will answer that you could not determine any
pitch because the sound lasted for too short a time. If you
were to fire a revolver next to a piano while keeping the
right pedal down, then all the strings would resonate,
showing that basically all the frequencies were present in
the sound of the shot: an overdose of pitch rather than no
pitch. The upshot of this exercise is that indeed duration
and frequency are dual to each other. The more accurate the
frequency (i.e. the smaller $\Delta f$) in a signal, the longer it
has to last (i.e. the larger $\Delta t$) and vice versa. In other
words one expects a relation like $\Delta f\,\Delta t \geq \text{constant}$
to hold. This is true and by the way the constant is $1/4\pi$.
The lesson here is that you can’t have it all: you cannot
have the cake and eat it. The physics in this example is
quite comprehensible and much like what we experience in
daily life, yet we encounter a situation where we cannot ask
for a signal that is precisely localized in time and also has
a well-defined pitch. These two physical quantities are in
that sense incompatible, and this duality is intimately
linked to the wave character of the phenomenon.
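The duration-frequency trade-off and the constant $1/4\pi$ can be checked with a discrete Fourier transform. The following is a sketch under stated assumptions: the pulses are complex tones with a carrier at 440 Hz, so that the spectrum has a single peak rather than the two peaks at $\pm f_0$ of a real cosine; the widths $\Delta t$ and $\Delta f$ are taken as standard deviations of the normalized power in time and in frequency; and the function name `time_frequency_spreads` is illustrative.

```python
import numpy as np

BOUND = 1.0 / (4.0 * np.pi)

def time_frequency_spreads(signal, t):
    """Return (Delta t, Delta f) as standard deviations of |s(t)|^2 and |S(f)|^2."""
    dt = t[1] - t[0]
    power_t = np.abs(signal) ** 2
    power_t = power_t / power_t.sum()
    mean_t = np.sum(t * power_t)
    delta_t = np.sqrt(np.sum((t - mean_t) ** 2 * power_t))

    freqs = np.fft.fftfreq(len(t), d=dt)
    power_f = np.abs(np.fft.fft(signal)) ** 2
    power_f = power_f / power_f.sum()
    mean_f = np.sum(freqs * power_f)
    delta_f = np.sqrt(np.sum((freqs - mean_f) ** 2 * power_f))
    return delta_t, delta_f

# Two seconds sampled at 8192 Hz, with a 440 Hz carrier.
t = np.linspace(-1.0, 1.0, 2 ** 14, endpoint=False)
carrier = np.exp(2j * np.pi * 440.0 * t)

# Gaussian pulses of different duration: short clap-like to long tone-like.
gaussian_products = {}
for sigma_t in (0.002, 0.02, 0.1):
    pulse = np.exp(-t ** 2 / (4.0 * sigma_t ** 2)) * carrier
    d_t, d_f = time_frequency_spreads(pulse, t)
    gaussian_products[sigma_t] = d_t * d_f
    print(f"Gaussian: dt = {d_t:.4f} s, df = {d_f:7.3f} Hz, product = {d_t * d_f:.4f}")

# A non-Gaussian pulse (two-sided exponential decay) stays above the bound.
d_t, d_f = time_frequency_spreads(np.exp(-np.abs(t) / 0.01) * carrier, t)
exponential_product = d_t * d_f
print(f"Exponential: product = {exponential_product:.4f}, bound = {BOUND:.4f}")
```

In this sketch the Gaussian pulses all give the same product, the minimum $1/4\pi$, however long or short they are, whereas the exponential pulse gives a larger value.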


Let us switch now to electromagnetic waves which are
made up of many photons. Remember that photons obey
the Planck-Einstein relation $E = h\nu$, so we can replace
the frequency $\nu$ by the energy and obtain an energy-time
relation $\Delta E\,\Delta t \geq \hbar/2$ (multiplying
$\Delta f\,\Delta t \geq 1/4\pi$ by $h$ gives $h/4\pi = \hbar/2$), and that
is indeed exactly an instance of Heisenberg’s uncertainty
relations. The interpretation is that we cannot measure both
variables with arbitrary precision simultaneously.

Footnote: The two figures are not entirely symmetric because I chose
to clap at time $t = 0$. The exactly dual situation would be obtained by
choosing $\omega = 0$ in the first figure: then the cosine function would
become constant, $\cos 0 = 1$, and the two peaks would move on top of
each other as $f_0 = 0$.


Heisenberg’s derivation


With the formal ingredients we have so far introduced it 
turns out to be rather straightforward to actually derive the 
uncertainty relation for two observables. It really is a mat- 
ter of simple algebra but with objects that look awesome. 
You feel like you are juggling with antique Chinese vases 
but in fact they are just empty plastic bottles. 


Let us consider two observables $A$ and $B$; in particular we
study two vectors $(A - a)|\psi\rangle$ and $(B - b)|\psi\rangle$, where $a = \langle A\rangle$
and $b = \langle B\rangle$ are real numbers. The variance (the mean
square deviation) of an operator $A$ in a state $|\psi\rangle$ is defined
in terms of expectation values as (see the Math Excursion
on Probability and statistics in Volume III):

$$(\Delta A)^2 = \langle (A - a)^2\rangle = \langle A^2 - 2aA + a^2\rangle = \langle A^2\rangle - a^2.$$

The variance is a measure for the width of the distribution.
Note that if $|\psi\rangle$ is an eigenstate of $A$, meaning that
$A|\psi\rangle = a|\psi\rangle$, then $\Delta A = 0$. Now there is a famous
inequality for vectors called the Schwarz inequality. It says that
if you have two vectors and their inner product, then the
product of their lengths squared is always larger than or equal to
their inner product squared. In the familiar Euclidean
setting we would have $|v \cdot w|^2 = |v|^2 |w|^2 \cos^2\theta \leq |v|^2 |w|^2$,
which holds because the cosine squared is at most
one. Applied to our vectors above this yields the statement
that

$$\langle |A - a|^2\rangle\,\langle |B - b|^2\rangle \geq |\langle (A - a)(B - b)\rangle|^2.$$

The left-hand side is just $(\Delta A)^2 (\Delta B)^2$.
Note that on the right-hand side $\langle (A - a)(B - b)\rangle$ is just
some complex number; let us call this number $z$. Then the
absolute value squared is

$$|z|^2 = z^* z = (\mathrm{Re}\,z)^2 + (\mathrm{Im}\,z)^2,$$

and clearly $|z|^2 \geq (\mathrm{Im}\,z)^2$, where

$$\mathrm{Im}\,z = \frac{1}{2i}(z - z^*) = \frac{1}{2i}\langle [A, B]\rangle.$$

The commutator is the only term that survives because
$z^* = \langle (A - a)(B - b)\rangle^* = \langle (B - b)(A - a)\rangle$ and all other
terms cancel out.

Putting the results of the above equations together, we
arrive at the desired result, the celebrated Heisenberg
uncertainty relation in its general form:

$$\Delta A\,\Delta B \geq \frac{1}{2}\,|\langle i[A, B]\rangle|. \qquad \text{(II.2.20)}$$

Note that if $A$ and $B$ are hermitian then also $i[A, B]$ is,
which makes its expectation value real. We obtain a
non-zero lower bound for the product of uncertainties in the
case the operators $A$ and $B$ do not commute. An
immediate consequence of the relation is that in any state the
uncertainty in the measurement value for two such
incompatible variables can never be zero for both. There is a
complementarity: the more precisely you know observable
$A$ the less precisely you know the value of $B$. It is the golden
rule for giving and taking: you can’t have it all.
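Since the general relation holds for any pair of hermitian operators and any state, it can be spot-checked on random examples. The sketch below assumes NumPy; the dimension 4, the number of trials and all function names are arbitrary illustrative choices, and passing such a test is of course no substitute for the derivation above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def random_hermitian(n):
    """Random n x n hermitian matrix, standing in for an observable."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def random_state(n):
    """Random normalized state vector in an n-dimensional Hilbert space."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

def expectation(op, psi):
    """<psi| op |psi> for a normalized state psi."""
    return np.vdot(psi, op @ psi)

def spread(op, psi):
    """Delta A = sqrt(<A^2> - <A>^2)."""
    mean = expectation(op, psi).real
    mean_sq = expectation(op @ op, psi).real
    return np.sqrt(max(mean_sq - mean ** 2, 0.0))

def uncertainty_sides(A, B, psi):
    """Return (Delta A * Delta B, |<[A, B]>| / 2)."""
    commutator = A @ B - B @ A
    return spread(A, psi) * spread(B, psi), 0.5 * abs(expectation(commutator, psi))

results = [
    uncertainty_sides(random_hermitian(4), random_hermitian(4), random_state(4))
    for _ in range(1000)
]
violations = sum(lhs < rhs - 1e-12 for lhs, rhs in results)
print("violations of the uncertainty relation:", violations)
```

The small tolerance only absorbs floating-point rounding; no sampled case should fall below the bound.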


Qubit uncertainties 


After this derivation of the precise form (II.2.20) of the
uncertainty relations it is interesting to see how these
relations play out for the simple case of qubits.


We are going to check the qubit uncertainties in the cases
we considered before. If we take as two incompatible
observables $A = Z$ and $B = X$, then the relation would
read

$$\Delta Z\,\Delta X \geq \frac{1}{2}\,|\langle [Z, X]\rangle| = |\langle Y\rangle|. \qquad \text{(II.2.21)}$$

Figure II.2.21: Spin uncertainties. Uncertainty in spin
measurements of $Z$ and $X$ (each marked by its own symbol in the
figure), for the states $|1\rangle$ and $|+\rangle$ respectively. The blue
numbers are the probabilities for the various outcomes. We see that
where one of the spin measurements has minimal uncertainty
($\Delta = 0$), the other is maximal ($\Delta = 1$). Had we chosen an
eigenstate $|r\rangle$ of $Y$ then the uncertainty in both $X$ and $Z$ would
have been maximal, and the uncertainty relation would again be
satisfied.

Let us then choose for the states $|\psi\rangle$ subsequently (i) $|1\rangle$,
(ii) $|+\rangle$, and (iii) the eigenstate of $Y$ with eigenvalue $+1$,
denoted by $|r\rangle$. We recall that $Z^2 = X^2 = 1$ and also
that $|\langle A\rangle|$ equals either 1 or 0 for our $A$ depending on
whether $|\psi\rangle$ is an eigenstate of $A$ or not. This makes the
calculation relatively simple; for example, for the left-hand
side we obtain:

$$(\Delta A)^2 = \langle A^2\rangle - \langle A\rangle^2 = \begin{cases} 1 - 1^2 = 0 & \text{(if eigenstate)} \\ 1 - 0^2 = 1 & \text{(if not eigenstate)} \end{cases}$$

and for the right-hand side:

$$|\langle\psi|Y|\psi\rangle| = \begin{cases} 1 & \text{(if eigenstate)} \\ 0 & \text{(if not eigenstate)} \end{cases}$$
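As a cross-check, both sides of the qubit relation can be evaluated numerically for the three states. This is a minimal sketch assuming NumPy; the dictionary keys and helper names are illustrative, and the state $|r\rangle$ is taken to be $(|1\rangle + i|{-1}\rangle)/\sqrt{2}$, the eigenvector of $Y$ with eigenvalue $+1$.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

states = {
    "|1>": np.array([1, 0], dtype=complex),
    "|+>": np.array([1, 1], dtype=complex) / np.sqrt(2),
    "|r>": np.array([1, 1j], dtype=complex) / np.sqrt(2),
}

def expectation(op, psi):
    """<psi| op |psi> for a normalized state psi."""
    return np.vdot(psi, op @ psi)

def spread(op, psi):
    """Delta A = sqrt(<A^2> - <A>^2), clipped at zero against rounding."""
    mean = expectation(op, psi).real
    mean_sq = expectation(op @ op, psi).real
    return np.sqrt(max(mean_sq - mean ** 2, 0.0))

commutator = Z @ X - X @ Z   # equals 2i Y

sides = {}
for label, psi in states.items():
    lhs = spread(Z, psi) * spread(X, psi)
    rhs = 0.5 * abs(expectation(commutator, psi))
    sides[label] = (lhs, rhs)
    print(f"{label}: dZ*dX = {lhs:.3f}, |<Y>| = {rhs:.3f}")
```

The printed pairs are the left- and right-hand sides of (II.2.21) for cases (i), (ii) and (iii).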
So, for the subsequent cases we end up with the following
inequalities: case (i) $0 \cdot 1 \geq 0$, case (ii) $1 \cdot 0 \geq 0$ and case
(iii) $1 \cdot 1 \geq 1$, and we happily agree that in all cases the
uncertainty relation is satisfied and moreover saturates the
lower bound. In Figure II.2.21 we give the various
measurement outcomes with their probabilities for the $Z$ and $X$
observables for the three states $|1\rangle$, $|+\rangle$ and $|r\rangle$.


From these the variances on the left-hand side of (II.2.21)
can immediately be read.

Ground state energy. For a quantum
particle the lowest energy state will,
even if it is weakly localized, always
have some extra zero point energy associated with
it. Adding up all the zero point energies of all
particles means that what we call the ‘vacuum’ must be
full of energy. Can’t we get it out and do something
useful with it? That is a question that regularly comes up.
No, presumably not. All physical observables like
spectral lines and so on are related to energy
differences, and you are free to choose the ground
state level as it has no observable effect.

Having said that, you could of course scratch your
head, and modestly point out to me that there is a
notable exception, and that is Einstein’s theory of
general relativity, where the vacuum energy does
indeed cause physical effects, even of cosmic
importance. The shocking news has been that indeed
the energy balance in our universe is dominated by
the vacuum contribution, which amounts to some
70 percent. But it remains a complete mystery why
that number is what it is. Yet, this vacuum energy
is like a cosmological constant and it has the
mind-blowing property that it anti-gravitates and exerts an
outward gravitational pressure that makes the
universe expand, and will keep the universe expanding
forever as we discussed briefly in Chapter I.2. So,
there are instances where much ado about nothing
is quite OK, especially if one understands nothing
about that nothing.


Let me make a final comment. Let us go back to the
discussion of ‘bit dynamics’ at the beginning of Chapter II.1.
There we stated that $Z$ could be interpreted as a ‘position’
operator giving the $\pm 1$ eigenvalue for the spin-up (down)
state. In that context the $X$ operator ‘generated’
translations (hopping in $z$) and as such acted like a ‘momentum’
operator. And once more we see that the two operators
do not commute and hence satisfy non-trivial uncertainty
relations. By the way, these uncertainties imply that
quantum computers will provide an array of potential answers,
from which the correct one has to be selected somehow.


The breakdown of classical determinism 


The uncertainty relations imply strict limits on the predicta- 
bility in physics. This unpredictability implies the break- 
down of classical determinism. A surprising and profound 
philosophical sacrifice in the realm of our material universe. 


The uncertainty relations of quantum theory go further:
they imply that if we know the particle has a small
uncertainty in position because we just measured its position,
then it is in a state where the uncertainty in momentum
will be relatively large. If you were to ask me to tell you
where the particle would be some time after, then it would
be hard to point at a specific point. I do know its starting
position precisely, but I don’t know its momentum, and thus
it is hard to say where it goes and with what momentum.
We see that the quantum postulates, concisely expressed
in the uncertainty relations, imply the breakdown of
classical predictability and determinism. This is one more truly
quantessential feature of the underlying reality.


Humankind’s limited abilities to observe have, through our
common experiences, precipitated in what we call deep
intuitions about how the world works. And such intuitions
tend to shape our judgements and expectations. One thing
that has become inescapably clear is that quantum
theory has shown such intuitions to be mistaken
in an essential way, a sobering thought indeed. That once
more illustrates the power of the invisible. At this point I
should remind you of the wonderful quote from Feynman
which I included in the preface to Volume I on page
xiv.


This fundamental indeterminacy in nature has led to
numerous speculations on the far-reaching consequences it
might have, varying from metaphysical hocus-pocus like
floating tables to explanations of human free will.


Why does classical physics exist anyway? 


After all this classical physics bashing, you might ask: how 
come classical physics is doing so extremely well in ordi- 
nary life, if it is so fundamentally wrong? How can that 
be? 


A golf ball. Let us consider a golf ball. If I neglect its
internal structure, should I not treat it as a quantum particle
and, if I do so, just reproduce the classical answer? Yes, you
had better do so, otherwise quantum theory would be in conflict
with direct observations. Suppose you would make an
extremely accurate measurement and measure its momentum
to four decimal places, so $\Delta p = 10^{-4}$ kg m s$^{-1}$; then
substituting this into the uncertainty relation you would find
that the uncertainty in position would be a mesmerizingly
tiny $\Delta x \geq h/(4\pi\Delta p) \approx 0.5 \times 10^{-30}$ m. But wait, that is the
realm where string theorists wander. You will agree that
nobody is ever going to make a measurement of position
with such 30-decimal-place accuracy, let alone of a golf
ball! Think of an ultimate machine like the Large Hadron
Collider at CERN, where physicists are able to localize
particles ‘only’ up to about $10^{-18}$ meters at present. Physicists
may have their ways, but to verify the uncertainty relations
by playing golf in the LHC is not one of them. So, what then
saves the day for classical physics or, if you prefer, what
saves quantum physics? That is the dazzling smallness
of Planck’s constant if you express it in our anthropocentric
system of units, made up of meters, seconds and
kilograms. That is why the basic need for quantum theory, i.e.
the failure of classical theory, manifests itself at first only
on small scales, and it is also for that reason that it took so
long for the quantum world to be discovered.


An electron. To appreciate the point just made, let us
replace the golf ball by an electron with a mass of about
$10^{-30}$ kg. Then we could easily measure its momentum
with an uncertainty of $10^{-30}$ kg m s$^{-1}$, leaving a
position uncertainty of about one tenth of a millimeter. So,
indeed in an atom with a typical size of $10^{-10}$ m
(one-tenth of a nanometer) this uncertainty matters and
therefore we should treat the electron quantum mechanically.
This observation by the way implies that we should no
longer think of electrons as well-localized particles orbiting
the nucleus. Indeed the way the atom is usually depicted
(see Figure I.3.6) is a severe misrepresentation inherited
from our classical intuition. Rather we should represent
the electron as a standing wave pattern of the probability
wave in the tiny volume of atomic size. Atoms are not like
tiny solar systems, but rather like tiny quantum bongos!

In fact, knowing the size of the atom to be about
$\Delta x \sim 10^{-10}$ m, one may use the uncertainty relations to
estimate the minimal momentum as $p \sim \Delta p = h/(4\pi\Delta x) \approx
0.5 \times 10^{-24}$ kg m/s, which corresponds to an electron energy of
$10^{-19}$ Joule $\approx 1$ eV. And 1 electron volt is indeed the
order of magnitude of atomic energy levels. It can’t be much
less and you could even say that this is one of the reasons
that matter is actually stable.
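The golf-ball and electron estimates are plain arithmetic with $\Delta x\,\Delta p \geq h/4\pi$ and can be reproduced in a few lines. The sketch below uses standard values for the constants; the conversion of the minimal momentum to an energy via the non-relativistic kinetic energy $p^2/2m$ is an assumption about what the estimate in the text intends, and the function name `min_spread` is illustrative.

```python
import math

H_PLANCK = 6.62607015e-34        # Planck's constant in J s
ELECTRON_MASS = 9.1093837e-31    # electron mass in kg
ELECTRON_VOLT = 1.602176634e-19  # one electron volt in J

def min_spread(other_spread):
    """Smallest spread allowed by dx * dp >= h / (4 pi), given the other spread."""
    return H_PLANCK / (4.0 * math.pi * other_spread)

# Golf ball: momentum known to 1e-4 kg m/s.
golf_dx = min_spread(1e-4)

# Free electron: momentum known to 1e-30 kg m/s.
electron_dx = min_spread(1e-30)

# Electron confined to an atom of size 1e-10 m.
atom_dp = min_spread(1e-10)
atom_energy_ev = atom_dp ** 2 / (2.0 * ELECTRON_MASS) / ELECTRON_VOLT

print(f"golf ball : dx >= {golf_dx:.2e} m")
print(f"electron  : dx >= {electron_dx:.2e} m")
print(f"atom      : dp >= {atom_dp:.2e} kg m/s, energy ~ {atom_energy_ev:.2f} eV")
```

The three results reproduce the orders of magnitude quoted above: about $10^{-30}$ m for the golf ball, a fraction of a millimeter for the free electron, and an energy of order 1 eV for the electron in an atom.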


The emergence of classical physics. The macroscopic
world, which obeys by definition the classical laws of
physics, is a world consisting of emergent phenomena, and the
classical laws are therefore only approximately true. The
world we perceive is an incredibly coarse-grained version
of a well-shielded microscopic reality. Our world has an
incredible amount of entropy exactly because there is so
much information hidden within, and science is exactly the
systematic uncovering of that information and making it
accessible. It is a gigantic hacking operation, a gigantic
striptease of mother nature in which she slowly
confides to us her deepest secrets. There are many
why-questions one may ask on the macroscopic level that can
be answered only after they have been turned into
how-questions on the underlying quantum level. In other words,
classical physics is the emergent macroscopic
manifestation of an underlying quantum world. The quantessence
comprises the inescapable laws underlying classical
reality. This exemplifies the profound gain of progressing
insight in the long run. The process of scientific progress
is seldom gradual and smooth, and rather proceeds
unpredictably, with sudden shocks. In evolutionary biology
Stephen Jay Gould introduced the notion of punctuated
equilibrium, which clearly echoes in the picture of long periods
of ‘normal’ science, broken up by scientific revolutions, radical
turning points in our thinking leading to paradigm shifts, as
described by Thomas Kuhn in his book on Scientific
Revolutions. I may add that important novel cultural dimensions
have opened up as a result of this process of progressing
insight in science, as I have argued in my book In praise of
science.


Further reading on quantum measurement: 


— Quantum Theory 
D. Bohm 
Dover Publications Inc (1989) 


— Quantum Measurement Theory and its Applica- 
tions 
Kurt Jacobs 
Cambridge University Press (2017) 
Table II.2.2: Key quantum principles introduced in this chapter on observables.

Keyword: Description

Observables: A physical variable $a$ or observable is represented by a hermitian operator or
matrix $A$. To the system as a whole corresponds a set (algebra) of observables
$\mathcal{O} = \{A, B, C, \ldots\}$.

Eigenvalues: The observable $A$ has a set of real eigenvalues $\{a_i\}$ which make up the sample space
or spectrum $S_a$ of possible measurement outcomes for $A$.

Eigenvectors: To each eigenvalue $a_i$ corresponds an eigenvector $|a_i\rangle$, or a subspace $V_i$.

Preferred frames: In the non-degenerate case, the eigenvalues of $A$ are all different, their number equals
the dimension of the Hilbert space, and the set of normalized eigenvectors $\{|a_i\rangle\}$ forms
an orthonormal basis for $\mathcal{H}$.

Superposition: Any state $|\psi\rangle$ has a linear expansion in the basis of any framework:
$|\psi\rangle = \sum_i \beta_i |a_i\rangle$.

(In)compatibility: Observables are compatible if (and only if) they mutually commute so that common
eigenvectors can be chosen. Observables that do not commute are by definition
incompatible.

Frameworks: A maximal number of independent compatible observables forms a framework $F$. A
complete orthonormal set of joint eigenvectors of a framework forms a basis for the
Hilbert space $\mathcal{H}$.

Measurement outcomes: When making a measurement of an observable $A$ on a state $|\psi\rangle$ there is a probability
$p_k = |\langle a_k|\psi\rangle|^2$ of obtaining the result $a_k$.

Projective measurement: Upon measuring the value $a_i$ in a strong or projective measurement of $A$, the state $|\psi\rangle$
‘collapses’ to the eigenstate $|a_i\rangle$ of $A$. This statement is referred to as the projection
postulate of Von Neumann.


Chapter II.3 


Interference 


We have seen that a quantum particle like an electron has wave-like features and that an electromagnetic wave has particle-like properties, as we may consider such a wave as a collective of photons. This naturally raises the question of how quantum particles really exhibit these wave-like properties. In this chapter we focus on the question of whether particles can show interference effects like waves do. The answer is affirmative, as is demonstrated by the famous double slit experiments of various kinds. We consider classical as well as quantum wave phenomena.


Classical wave theory and optics 


Classical geometric optics treats light as straight rays that 
can be deflected or reflected by different media. The strict 
geometrical picture consisting of straight light rays can be 
augmented by the wave-type constructions based on Huy- 
gens’ principle, which states that any point on a wavefront 
can be considered as a source of secondary spherical 
waves. It is not only the laws and patterns of geomet- 
ric optics like reflection and refraction (breaking) of light 
at interfaces between different media that can then be explained, but also more subtle effects like diffraction (bending).


Figure II.3.1: Dew drop. In this lithograph of M.C. Escher, the reflection of light causes the image of the windows of the observer's room. The refraction or breaking of light at the water-air interface yields the enlarged image of the underlying veins of the leaf. (© 2023 The M.C. Escher Company.)


Basics of wave theory 


Figure II.3.2: Wave patterns and propagation. We show the wave pattern corresponding to propagating wavefronts from a single point-like source, to illustrate some basic wave concepts.


Characteristics of waves. Let us recall some basics of classical wave theory. A propagating wave can in general be characterized by:

(i) a periodic wave pattern of subsequent maxima and minima. The height of the maxima is called the amplitude of the wave. The curves connecting adjacent maxima are called wavefronts; for the case of plane waves these are parallel straight lines or planes, while for a single source these are circular or spherical as illustrated in Figure II.3.2.
(ii) a pattern of rays, which are lines perpendicular to the wavefronts. So from a point source the rays are straight lines pointing radially out.
(iii) a wavelength $\lambda$, which is defined as the distance between two subsequent wavefronts measured along a ray. Often one uses the wavenumber $k$, defined by $k = 2\pi/\lambda$, instead of the wavelength.
(iv) a speed v , which is the speed at which the wavefronts 
propagate. For light and other electromagnetic forms of ra- 
diation propagating in vacuum, this is the universal speed 
of light c. In physical media (like glass) with electrody- 
namic properties different from the vacuum, however, the 
velocity of light will be less than its universal value in vac- 
uum. The speed of light in media may generally depend 
on the wavelength (or frequency). 
(v) a frequency $f$, which is the rate at which every point in the wave oscillates.
(vi) the direction of oscillation: we distinguish longitudinal waves, where the medium oscillates parallel to the direction of propagation (sound), and transversal waves, where it oscillates orthogonal to that direction (light).
(vii) a polarization. Transversal waves can be (linearly) polarized, meaning that there is a single orthogonal axis along which the field oscillates.


Typical sizes and scales. For water waves the wavelength may vary from micrometers to many miles. For sound audible by the ear, in air at room temperature, the frequency $f$ varies from 20 Hz to 20,000 Hz; and with the sound velocity $v = 343$ m/s, the wavelength would vary between 1.7 cm and 17 m. For visible light the typical wavelength is thousands of ångströms ($\sim 10^{-7}$ m). For microwaves it is tempting to guess a wavelength of the order of micrometers, but in fact their wavelengths range from about a millimeter to a meter. For quantum particle waves the scale is set by the De Broglie wavelength $\lambda = h/p$, typically about $10^{-10}$ meters or 1 ångström.
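The scales quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the function names are our own, and the 150 eV electron energy is an illustrative assumption chosen because it gives a wavelength close to one ångström.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck constant in J s (exact SI value)
M_ELECTRON = 9.1093837015e-31  # electron mass in kg
EV = 1.602176634e-19           # one electronvolt in joule


def wavelength(speed, frequency):
    """Wavelength from the relation v = lambda * f."""
    return speed / frequency


def de_broglie_wavelength(kinetic_energy_ev, mass=M_ELECTRON):
    """Non-relativistic De Broglie wavelength lambda = h / p, with p = sqrt(2 m E)."""
    momentum = math.sqrt(2.0 * mass * kinetic_energy_ev * EV)
    return H_PLANCK / momentum


if __name__ == "__main__":
    v_sound = 343.0  # m/s, the value used in the text
    print("lowest audible tone :", wavelength(v_sound, 20.0), "m")
    print("highest audible tone:", wavelength(v_sound, 20000.0), "m")
    # 150 eV is an assumed, illustrative electron energy.
    print("electron at 150 eV  :", de_broglie_wavelength(150.0), "m")
```

The non-relativistic momentum formula is adequate here because 150 eV is tiny compared with the electron rest energy; for fast particles one would have to use the relativistic momentum instead.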


Fundamental wave relations. There is a fundamental relation between the velocity, frequency and wavelength of a wave given by $v = \lambda f$. Mostly when talking about waves one assumes these are described by a linear theory. In such situations the linear superposition principle holds, so to understand the wave phenomena caused by independent sources one can simply add the wave patterns produced by the sources individually. On the one hand this applies to wave phenomena in general as long as the oscillations are small, because then the linear approximation holds well; on the other hand we know that the Maxwell equations describing electromagnetic waves are linear, and so are the Schrödinger and Dirac equations.


The physics of waves in a medium is interesting, because a wave carries a certain amount of energy and momentum. This, however, does not imply that matter somehow moves along with the wave. Think of a wave of water: you drop a stone in the pond, which excites the water surface, locally perturbing the equilibrium situation. It is the deformation energy of the (elastic) medium that causes the perturbation (along with some characteristic deformation energy density) to spread as a wave pattern. As the total energy of the perturbation is conserved (if we assume that there is no dissipation), the amplitude of the circular wave has to decrease in time because the circumference of the wavefront increases. Anyway, for this transversal wave the position of a water molecule stays fixed as it only oscillates up and down. In the case of sound, the air molecules swing back and forth, but also in that case there is no material streaming along with the wave.


Figure II.3.3: Dispersion of a wavepacket. Depicted is a Gaussian shaped wavepacket in $x$-space at $t = 0$ and after five equal time intervals. The packet disperses (broadens) in time because different components travel at a different speed.


With light waves the situation is different though, because the lightwave is made up of photons, all moving with the same speed of light. The classical wave does not correspond to a single photon, rather it is a strange coherent superposition of different states with a different number of photons in them. They may all have the same frequency, but the various terms can involve quite arbitrary phases. As a matter of fact what this means is that the number of photons corresponding to a ‘classical’ wave is really not defined. This is not meant in a statistical sense but in a more fundamental way. To speak in the spirit of the previous chapter, their number is not defined, or better indefinite, because the corresponding ‘number operator’ is incompatible (does not commute) with the quantum operator that creates the classical wave configuration from the vacuum. In other words an electromagnetic wave is not in an eigenstate of the photon-number operator.


Figure II.3.4: Group and phase velocity. We have depicted a superposition of two linear waves with different frequencies and momenta. The combined result shows an enveloping wave in blue moving with the group velocity $v_{gr}$, and the actual superposition in red moving faster with phase velocity $v_{ph}$.


Dispersion. We have mentioned the fact that waves of different wavelength or frequency may travel at different speeds: this phenomenon is called dispersion. The best known example is the dispersion of light in glass, giving rise to the separation of colors when light passes through a prism, as in Figure II.3.8. Dispersion means that the velocity and frequency depend on the wavelength or the wavenumber. It is usually specified by giving the functional relation between the angular frequency $\omega = 2\pi f$ and the wavenumber $k$, so by specifying $\omega = \omega(k)$. And we have seen that electromagnetic waves satisfy the linear dispersion relation $\omega = ck$, while for the De Broglie matter waves we have a quadratic dispersion because $E = p^2/2m$ with $p = \hbar k$ and $E = \hbar\omega$ yields $\omega = \hbar k^2/2m$.


Broadening. The effect of dispersion manifests itself if we consider the time evolution of a wavepacket, which is just some linear superposition of components with different wavelengths. In Figure II.3.3 we see an initial packet that has some shape which is spatially localized with a certain width. One will find that such a packet will broaden or spread out (disperse) during its propagation, because the momentum components that make up the packet move at different speeds.
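For the special case of a free Gaussian matter wavepacket the broadening can be made quantitative. The sketch below relies on the standard textbook result $\sigma(t) = \sigma_0\sqrt{1 + (\hbar t/2m\sigma_0^2)^2}$ for the width of the position distribution; this formula is not derived in the text above, and the function names and the choice of an electron with an initial width of one ångström are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant in J s
M_ELECTRON = 9.1093837015e-31  # electron mass in kg


def packet_width(t, sigma0, mass=M_ELECTRON):
    """Width sigma(t) of a free Gaussian wavepacket with initial width sigma0.

    Uses sigma(t) = sigma0 * sqrt(1 + (hbar t / (2 m sigma0^2))^2).
    """
    spread = HBAR * t / (2.0 * mass * sigma0 ** 2)
    return sigma0 * math.sqrt(1.0 + spread ** 2)


def characteristic_time(sigma0, mass=M_ELECTRON):
    """Time after which the width has grown by a factor sqrt(2)."""
    return 2.0 * mass * sigma0 ** 2 / HBAR


if __name__ == "__main__":
    sigma0 = 1.0e-10  # assumed initial width of one angstrom
    t_c = characteristic_time(sigma0)
    for n in range(6):  # t = 0 and five equal time intervals
        t = n * t_c
        print(f"t = {t:.3e} s  width = {packet_width(t, sigma0):.3e} m")
```

Because the characteristic time scales with the mass and with the square of the initial width, a tightly localized electron spreads almost instantly, whereas a macroscopic object would not spread noticeably.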


Group velocity. The next question that comes to mind is what the velocity of this wave packet is. After all it is made up of different components that move with different velocities. The basic answer to this question is illustrated in Figure II.3.4 for the simple case of a linear superposition of two waves with different frequencies and wave numbers. The combination can be rewritten as a product of a sum and a difference wave with frequencies $\omega_\pm = (\omega_1 \pm \omega_2)/2$ and wave numbers $k_\pm = (k_1 \pm k_2)/2$. What we obtain is that the actual superposition, which is the wave pattern in red, propagates ‘inside’ the slowly moving enveloping wave in blue. You could say that the red wave with frequency $\omega_+$ and wavenumber $k_+$ has its amplitude modulated by the blue wave with frequency $\omega_-$ and wavenumber $k_-$. The red wave moves with the phase velocity $v_{ph} = \omega_+/k_+$, whereas the envelope moves with the group velocity $v_{gr} = \omega_-/k_-$.
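The rewriting of the two-wave superposition as a product, and the two velocities that follow from it, can be verified numerically. This is a minimal sketch with our own function names; it assumes two cosine waves of equal unit amplitude and works in units where $\hbar/m = 1$ for the matter-wave example.

```python
import math


def two_wave_sum(x, t, k1, w1, k2, w2):
    """Direct sum of two unit-amplitude cosine waves."""
    return math.cos(k1 * x - w1 * t) + math.cos(k2 * x - w2 * t)


def carrier_times_envelope(x, t, k1, w1, k2, w2):
    """The same superposition written as a product of sum and difference waves."""
    k_plus, w_plus = (k1 + k2) / 2.0, (w1 + w2) / 2.0
    k_minus, w_minus = (k1 - k2) / 2.0, (w1 - w2) / 2.0
    carrier = math.cos(k_plus * x - w_plus * t)
    envelope = math.cos(k_minus * x - w_minus * t)
    return 2.0 * carrier * envelope


def velocities(k1, w1, k2, w2):
    """Phase velocity w_plus/k_plus and group velocity w_minus/k_minus."""
    v_phase = (w1 + w2) / (k1 + k2)
    v_group = (w1 - w2) / (k1 - k2)
    return v_phase, v_group


def omega_matter(k):
    """Quadratic dispersion w = hbar k^2 / 2m in units where hbar/m = 1."""
    return 0.5 * k ** 2


if __name__ == "__main__":
    k1, k2 = 10.0, 11.0
    w1, w2 = omega_matter(k1), omega_matter(k2)
    print("sum    :", two_wave_sum(0.3, 0.7, k1, w1, k2, w2))
    print("product:", carrier_times_envelope(0.3, 0.7, k1, w1, k2, w2))
    print("v_phase, v_group:", velocities(k1, w1, k2, w2))
```

For the quadratic matter-wave relation the sketch gives a group velocity of about twice the phase velocity, while for the linear relation $\omega = ck$ both velocities coincide with $c$.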


Dissipation. Dissipation refers to the loss of energy of a system, for example to the environment, or by producing heat internally due to friction. For waves, dissipation is often caused by inelasticity (viscosity) of the medium. Dissipation causes the signal to die out. Note that dispersion is not a dissipative phenomenon; it is just a consequence of the fact that different components of the wave packet move with different velocities.


Figure II.3.5: Three views. This picture offers three perspectives on yours truly, from a direct, to the point, a reflective and a refractive point of view. This can be achieved by just looking at a glass of wine!


Reflection, transmission, breaking and diffraction 


Huygens’ principle. To find out how the wavefront of a 
propagating wave moves forward, one may consider every 
point on the front as a source from which secondary waves 
emanate. The envelope of the secondary wavefronts de- 
fines the new wavefront. This is illustrated in the top right 
picture of Figure II.3.2. Huygens’ principle is a powerful 
tool to explain all kinds of generic wave phenomena, like 
reflection, refraction, diffraction and interference. A nice 
example of reflection and refraction on which the working of lenses is based is provided by M.C. Escher’s lithograph The dew droplet in Figure II.3.1.


Figure II.3.6: Reflection and refraction. The picture illustrates reflection and refraction at an interface between two media. From the vacuum to any medium the angle with the normal to the interface of the refracted ray $r$ is smaller than the angle $i$ of the incoming beam.


Reflection. Light can be reflected off a surface, as in the reflection of an ordinary mirror. The law of reflection in geometric optics reads simply:

$$ i = t , $$

or in words: the angle $i$ of the incoming beam (with the normal on the surface) is equal to the angle $t$ of the reflected beam. This is illustrated in Figure II.3.6.


Refraction or breaking. The law for breaking of light at an interface between two media with relative breaking index $n$ is given by Snellius’ law, which is also illustrated in the same figure:

$$ \frac{\sin i}{\sin r} = n , $$

where $n$ is given by the ratio of the speeds of light in medium 1 and medium 2:

$$ n = \frac{c_1(f)}{c_2(f)} . $$

The proof of both laws can be given using Huygens’ principle as we depicted in Figure II.3.7. We use the principle at the points where the incoming rays hit the layer between the two media, where the new front can be constructed using the same radii in the same medium (reflection), or reduced radii (because of the reduced speed of light) in the dense medium.


Figure II.3.7: Huygens’ principle. The construction for the reflected and refracted beams (thin lines) and wavefronts (fat lines) using Huygens’ principle, assuming you know the ratio of velocities in the two media, or breaking index.
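The two laws are easy to evaluate numerically. The sketch below is an illustration with our own function names; the two breaking indices used for ‘red’ and ‘blue’ light are assumed, illustrative values for a glass and are not taken from the text.

```python
import math


def reflection_angle(incidence_deg):
    """Law of reflection: the reflected angle equals the incoming angle."""
    return incidence_deg


def refraction_angle(incidence_deg, n):
    """Angle r of the refracted ray from Snellius' law sin(i) / sin(r) = n.

    Returns None when no refracted ray exists, which can only happen
    for n < 1 (going from the denser to the less dense medium).
    """
    s = math.sin(math.radians(incidence_deg)) / n
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    i = 30.0
    # Assumed, illustrative breaking indices for two colors in glass.
    for color, n in [("red", 1.51), ("blue", 1.53)]:
        print(color, "light refracts at", refraction_angle(i, n), "degrees")
```

Because the breaking index depends on the frequency, the two colors leave the interface under slightly different angles, which is the mechanism behind the color separation in a prism.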


Note that whereas in vacuum the velocity of light is uni- 
versal and therefore does not depend on the frequency or 
wavelength (color), this is no longer true in other media. 
As a consequence the angle of refraction will be different 
for different colors, as was so beautifully demonstrated by 
Newton by letting a sun ray pass through a prism (see Fig- 
ure II.3.8). 


Bragg diffraction and reflection. 


Figure II.3.8: Color decomposition of white light through a prism. The refraction of white light by passing through a prism. The propagation speed of light of different colors (frequencies) is different in glass and leads to different amounts of refraction.


William Henry Bragg and his son Lawrence Bragg proposed in 1913 a nice explanation of the reflection lines observed in X-rays of crystals. The key idea of their model was that X-rays would scatter off the individual atoms in subsequent layers of the crystal. The layers in a crystal are equally spaced with a distance $d$, a distance that is typically about $10^{-10}$ m. Requiring radiation with a wavelength comparable to $d$ means that we indeed need high frequency X-rays. The question then was to derive the condition for constructive interference of incident and reflected waves. Assuming a monochromatic wave incident under an angle $\theta$ with the surface of the crystal, the condition follows from Huygens’ principle, as is schematically depicted in Figure II.3.9. The difference in path length between the two rays scattering from the two top layers should equal an integer $m$ times the wavelength to obtain constructive interference. The integer $m$ is called the diffraction order. This leads to the Bragg formula:

$$ 2d\sin\theta = m\lambda . $$


This formula is general as long as the particles in the beam are scattered in a spherical fashion from each individual atom in the lattice. In that sense the formula can also be applied to matter waves, in other words, to the scattering of electrons or neutrons from crystal surfaces. By looking at different plane orientations this principle turns into a powerful technique to determine the spatial structure of crystals.


Figure II.3.9: Bragg reflection. The crystal consists of equally spaced layers. Two rays from the top two layers are drawn; the path difference between the incoming and outgoing wavefronts of the two paths equals $2d\sin\theta$, and this should equal an integer times the wavelength $\lambda$.
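The Bragg condition $2d\sin\theta = m\lambda$ fixes a discrete set of reflection angles. The following sketch lists them for given wavelength and layer spacing; the function name is ours, and the numbers in the example (a wavelength of 1.54 ångström and a spacing of 2 ångström) are assumed values chosen only for illustration.

```python
import math


def bragg_angles(wavelength, spacing):
    """Return {m: theta in degrees} for every order m allowed by 2 d sin(theta) = m lambda.

    Orders exist only as long as m * wavelength <= 2 * spacing.
    """
    angles = {}
    m = 1
    while m * wavelength <= 2.0 * spacing:
        angles[m] = math.degrees(math.asin(m * wavelength / (2.0 * spacing)))
        m += 1
    return angles


if __name__ == "__main__":
    # Assumed, illustrative values in meters.
    for order, theta in bragg_angles(1.54e-10, 2.0e-10).items():
        print(f"order m = {order}: theta = {theta:.2f} degrees")
```

The loop condition also shows why the wavelength has to be comparable to the spacing: if the wavelength exceeds $2d$ there is no reflection order at all.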


Beamsplitters and polarization 


In classical optics it was Newton who in his Opticks, pub- 
lished in 1704, introduced the prism to split a beam of light 
into its different light components (see Figure II.3.8), while
Huygens in his monumental Traité de la lumière, published 
in 1690, emphasized the importance of double breaking by 
‘Icelandic crystal’ or calcite, and explained it to a certain 
extent with his wave theory of light. 


These explanations were all based on the idea that different components of ‘ordinary’ light have different velocities in various media, and therefore have a different amount of refraction at interfaces between various media. And this is indeed a fundamental ingredient of all beam splitting devices. We should be aware that in the early theories of light that arose in the Enlightenment era through the works of Descartes and later of Huygens and Newton, many properties of light were discovered and these led to the great dispute between the latter two about the particle versus wave-like nature of light. The property of polarization was not really discussed, and understanding the transversal wave nature of light had to wait until Maxwell identified light as an electromagnetic wave two centuries later.


Figure II.3.10: Icelandic crystal. Double refraction of light by an Icelandic crystal or calcite.


However, it is remarkable to see how tantalizingly close Huygens came to discovering the nature of polarization, exactly because of his particular emphasis on the phenomenon of bi- or double refraction exhibited by light passing through an Icelandic crystal, which we have depicted in Figure II.3.10. This phenomenon occurs basically in all transparent anisotropic media. In his treatise he remarks:


Before finishing the treatise on this Crystal, I will add one more marvellous phenomenon which I discovered after having written all the foregoing. For though I have not been able till now to find its cause, I do not for that reason wish to desist from describing it, in order to give opportunity to others to investigate it. It seems that it will be necessary to make still further suppositions besides those which I have made; but these will not for all that cease to keep their probability after having been confirmed by so many tests.


He then goes on to describe how he studied the proper- 
ties of light subsequently passing through two crystals and 
makes the observation that the double refraction does not 
take place at the second crystal, as is clear from his illustration (see Figure II.3.11). He even goes as far as to observe that the properties of the second refraction depend on the orientation of the crystal. And his humble conclusion reads:


It seems that one is obliged to conclude that the 
waves of light, after having passed through the first 
crystal, acquire a certain form or disposition in virtue 
of which, when meeting the texture of the second 
crystal, in certain positions, they can move the two 
different kinds of matter which serve for the two 
species of refraction; and when meeting the sec- 
ond crystal in another position are able to move 
only one of these kinds of matter. But to tell how this occurs, I have hitherto found nothing which satisfies me.


In the following we discuss various cases of how the splitting of a beam, dependent on the polarization state of the particles, can be achieved. First we discuss some beam splitters for photons. Next we discuss the case of spin one half particles like electrons, protons and neutrons in a magnetic polarization device like the Stern-Gerlach setup. We also introduce some other devices from which more elaborate interference experiments can be assembled. Together, they form part of the toolkit for many famous experiments that demonstrated how different quantum theory really is, where particles can interfere with themselves, or where certain forms of non-locality (which are strictly forbidden in the classical realm) pertaining to entangled states of particles can be unambiguously demonstrated. This will be our focus in the remainder of this chapter.


Figure II.3.11: ‘A marvelous phenomenon.’ Double refraction of light does not occur in the second crystal. Illustration taken from Huygens’ Treatise on light.


Photon polarization: optical beamsplitters 


In modern (quantum) optics using monochromatic lasers, many quite stunning experiments have been performed, demonstrating the paradoxical but quantessential features of light and in particular its polarization. In the previous chapter we have already discussed various filters: polarizers on page 293, and wave plates on page 290, through which the polarization states can be selected and/or manipulated. Now we extend the toolset with some beamsplitters, much in analogy with the Icelandic crystal. These devices play a crucial role in experiments where properties like particle interference and entanglement can be put to the test.


Figure II.3.12: A half mirror. A half-silvered mirror reflects half the number of photons in a beam, the other half is transmitted. It is a beam splitter (BS) that is insensitive to the polarization state of the incoming photons.


Clearly, by splitting a beam one obtains two beams which 
are strictly in phase and therefore offer interesting experi- 
mental possibilities. 

A first splitting device would be the half-mirror, where half the number of photons in the beam gets reflected while the other half gets transmitted. As such this mirror is insensitive to the polarization state of the photons, as we have indicated in Figure II.3.12.

It is also possible to coat the interface with particular chemicals, in which case we obtain a polarizing beam splitter as depicted in Figure II.3.13; if the incoming beam is unpolarized, the reflected photons are horizontally polarized, while the transmitted ones are vertically polarized. We obviously can rotate the polarizing cube around the incoming photon momentum vector, generating a split between two other linear polarizations. This device acts much like the anisotropic crystals causing double refraction, like the ones Huygens mentioned. It is also similar to the Stern-Gerlach device to split a beam of spin-1/2 particles, to which we turn shortly.


Figure II.3.13: A polarizing beam splitter (PBS). This component is sensitive to the polarization state of the photons: the reflected ones are horizontally polarized, the transmitted ones vertically.
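For a single linearly polarized photon the action of the two splitters can be summarized by two probabilities. The sketch below is an idealized illustration with our own function names: it assumes a lossless polarizing beam splitter that reflects the horizontal and transmits the vertical component, and a photon whose polarization makes a given angle with the horizontal axis of the cube.

```python
import math


def half_mirror_probabilities():
    """Half mirror: reflection and transmission are equally likely for any polarization."""
    return 0.5, 0.5


def pbs_probabilities(polarization_deg, cube_rotation_deg=0.0):
    """Return (p_reflect, p_transmit) for an ideal polarizing beam splitter.

    polarization_deg  : angle of the linear polarization with the horizontal.
    cube_rotation_deg : rotation of the cube around the beam direction.
    The component along the cube's 'horizontal' axis is reflected.
    """
    relative = math.radians(polarization_deg - cube_rotation_deg)
    p_reflect = math.cos(relative) ** 2
    p_transmit = math.sin(relative) ** 2
    return p_reflect, p_transmit


if __name__ == "__main__":
    for angle in (0.0, 30.0, 45.0, 90.0):
        print(angle, "degrees:", pbs_probabilities(angle))
```

Rotating the cube only shifts the reference axis, which is why the same function describes a split between any other pair of orthogonal linear polarizations.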


A final device we want to mention is what is called a parametric down converter. It is a nonlinear crystal that takes an incoming monochromatic beam of a given frequency $f$ and splits a fraction of the incoming photons into two photons with half the frequency (or energy). These secondary photons leave the crystal under a small angle with the incoming beam as we have indicated in Figure II.3.14. As we will discuss later, the remarkable property of these secondary pairs is that their polarization states are entangled. Depending on the type of crystal this may be parallel or orthogonal entanglement, where one speaks of type I or type II down conversion.




Figure II.3.14: Two photons out of one. A parametric down
converter (PDC) is a nonlinear crystal where incoming photons 
may be converted down to two photons with half the energy or 
frequency. These secondary beams leave the crystal under a 
small angle with the primary beam. The polarizations of the 
secondary pair are entangled and can be chosen to be either 
parallel or orthogonal. 


Spin polarization: the Stern-Gerlach device 


Figure II.3.15: The Stern-Gerlach experiment. (a): Sending the electron beam through an inhomogeneous magnetic field will split the beam. (b) and (c): Symbolic representation of the Z and X polarizing beam splitters that we will use later on.
(a) Stern-Gerlach experiment: Measurement of spin polarization along the z-axis, of a state $|\psi\rangle$. The outcome can only be +1 or −1 in units $\hbar/2$, with probabilities depending on the particular state $|\psi\rangle$. The measurement outcome affects the state of the outgoing particle.
(b) Measurement of spin polarization along the z-axis, of a state $|\psi\rangle$.
(c) Measurement of spin polarization along the x-axis, of a state $|\psi\rangle$.


We have illustrated this means of polarizing the spin for various choices of the state $|\psi\rangle$ and observable $A$, being the spin polarization, in Figure II.2.9. Let us comment on their content. The green circle is the space of normalized quantum states; normally this would be a three-sphere but we have chosen the section where the coefficients $\alpha$ and $\beta$ are real, so we are left with an ordinary circle in $\mathbb{R}^2$. We consider two real observables, $Z$ and $X$, and combinations thereof; those always have eigenstates that lie on the circle. In the diagrams in the figure we see pairs of blue axes. These axes are in the direction of the eigenvectors of $A$ and labeled by their eigenvalue. The blue axes together thus represent the measurement frame corresponding to $A$. Now there are five things to observe:
(i) the blue axes have a direction but are not oriented, expressing the fact that the opposite points $\pm|\psi\rangle$ have the same probabilities. They are indistinguishable by measurement. In other words they only differ by a phase, which in this case is a real phase and can therefore only be −1;

(ii) perhaps it is also surprising that the frames corresponding to $Z$ and $X$ are not orthogonal; rather they only make an angle of 45°, half the expected angle. If we were to turn the polarizer to the minus z direction, thus rotating the polarizer in real space by 180° in the plane, this would interchange the eigenvalues and consequently interchange the axes of the measurement frame, which is equivalent to rotating in state space by half the angle, in this case 90°. Saying it yet differently: we have chosen the up and down state vectors of the spin as orthogonal unit vectors. This means that if we rotate the device by an angle $\varphi$ in ordinary x, y, z-space, then the polarization frame will only rotate by $\varphi/2$ in spinor space; for a rotation of 180° this means a rotation by 90° in the Hilbert space for this system. That explains why the choice of observable involves fixing two orthogonal axes in state space; it is really a choice of frame rather than selecting a particular direction.

(iii) Once the measurement has been made, one axis of the frame has been singled out, and the wavefunction ‘collapses’ to a normalized state along that axis. If in the example of Figure II.2.9(d) above, we happen to measure the X eigenvalue +1, then the state collapses along the corresponding axis, meaning that we move from the state in Figure II.2.9(d) to the state in Figure II.2.9(c).

This picture indeed allows us to make the projections on 
the axes which give the probability amplitudes while the 
measurement outcome labels the axis, and they also tell 
you what the collapsed state looks like. 


Indeed this graphical representation captures some quant- 
essential features of the measurement process. We will 
make use of it repeatedly later on. 

(iv) The analysis we just presented underscores the subtle 
meaning of the ‘state vector’ or wavefunction. Indeed it is 
important to always keep in mind that it is as much defining 
a state as it is a probability amplitude, which means a way 
of encoding probabilities of measurement outcomes of any 
given observable. 

Barbie’s choice. Let us now rephrase the measurement process in the language of the Barbie on the globe, the representation of spin space we introduced in the figure on page 285. The geometry is now somewhat different: we first have the spin in a certain state, which means that the Barbie is located at some point on the 2-sphere and pointing her nose in a certain direction of the tangent plane at her location. Making a measurement amounts to choosing an orientation in the X, Y, Z space, which we can mark as a line through the center, intersecting the unit sphere in two antipodal points. The intersection of the positive direction with the sphere corresponds to the positive eigenvalue eigenstate, and the negative intersection corresponds to the negative eigenvalue eigenstate. Indeed the choice of observable determines the eigenstates up to a phase factor. So, staying within the narrative, choosing the orientation of the detector corresponds to installing two inspectors at the corresponding antipodal points on the sphere. These inspectors do not look in any specific direction, they just search around and try to spot the Barbie. Once they have spotted her they both call to her (the sphere is of course transparent, a crystal three-sphere...) and order her to report immediately at their place. Barbie doesn’t quite know which of the two to choose, but she makes a choice. It doesn’t matter whom Barbie chooses as long as the probabilities are in accordance with her little quantum calculation. The inspectors go home and leave her on the spot she happened to choose. That’s the state she ends up in, and that was what the measurement was.


(v) Bearing the previous points in mind there is an additional remark to be made at this point. Did we necessarily make a measurement or not? When we put a screen and record the electron hitting the screen we surely have made a measurement of its spin. But you may also imagine an experiment where we do not register (or measure) it explicitly, but think of the experiment as a way to select the initial spin state for some other experiment that makes use of the upper or lower beam. Then it is clear that the Stern-Gerlach device is used as a preparatory device to select an incoming spin state.


And that naturally accommodates the fact that the state 
alters after a measurement, because the information we 
gather from the measurement may drastically affect the 
probabilities. It is not that we as observers play a role, 
because we may or may not look at the results, it is the 
interaction that has or has not taken place between the 
apparatus and the system, which matters. 
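The frame picture used in the observations above can be reproduced in a few lines. The sketch below is our own illustration, restricted like the figures to real states on the circle: it assumes the observable $A = \cos\varphi\, Z + \sin\varphi\, X$ for a device rotated by $\varphi$ in the x-z plane, with $Z$ and $X$ the usual Pauli matrices, and all function names are hypothetical.

```python
import math


def measurement_frame(phi_deg):
    """Eigenvectors of A = cos(phi) Z + sin(phi) X on the real state circle.

    Returns (plus, minus): unit vectors for the eigenvalues +1 and -1.
    The frame is rotated by phi/2 in state space.
    """
    half = math.radians(phi_deg) / 2.0
    plus = (math.cos(half), math.sin(half))
    minus = (-math.sin(half), math.cos(half))
    return plus, minus


def apply_observable(phi_deg, state):
    """Apply the matrix [[cos phi, sin phi], [sin phi, -cos phi]] to a state."""
    phi = math.radians(phi_deg)
    c, s = math.cos(phi), math.sin(phi)
    a, b = state
    return (c * a + s * b, s * a - c * b)


def outcome_probabilities(phi_deg, state):
    """Probabilities of +1 and -1: squared projections of the state on the frame axes."""
    plus, minus = measurement_frame(phi_deg)
    amp_plus = plus[0] * state[0] + plus[1] * state[1]
    amp_minus = minus[0] * state[0] + minus[1] * state[1]
    return amp_plus ** 2, amp_minus ** 2


if __name__ == "__main__":
    up = (1.0, 0.0)
    for phi in (0.0, 90.0, 180.0):
        plus, _ = measurement_frame(phi)
        frame_angle = math.degrees(math.atan2(plus[1], plus[0]))
        print(f"device angle {phi}: frame angle {frame_angle}, "
              f"probabilities for spin up {outcome_probabilities(phi, up)}")
```

The choice $\varphi = 90°$ corresponds to measuring $X$ and gives a frame at 45° to the $Z$ frame, while $\varphi = 180°$ rotates the frame by 90° and so interchanges the two axes.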


Interference: double slit experiments 


An important property of waves is that if we combine two 
of them their amplitudes are added together and we get 
interference: in places where the waves are in phase the 
combined wave gets a maximal amplitude and where they 
are out of phase they will compensate resulting in a re- 
duced amplitude. 




Figure II.3.16: Interference. A ‘sound’ interference experiment, 
due to Georg Hermann Quincke, which demonstrates the inter- 
ference of sound waves. Image from a 19th century high school 
book on physics. 


The interference of sound. A simple demonstration of classical interference can be given with the sound experiment devised in the nineteenth century by Georg Hermann Quincke, as shown in Figure II.3.16. In the modern guise a tone is generated with a small loudspeaker at the point a on top; the sound (air pressure) wave splits and propagates through both the left and right tubes. They come together again at the point a’ at the bottom, where the two waves interfere. The difference in length between the left and right paths can be adjusted so as to obtain constructive or destructive interference. In the latter case a microphone positioned at a’ would not register any sound. The crucial thing is that the total signal at the microphone is built up from the various amplitudes along the two independent paths and in that sense this is really a kind of double slit experiment.


Figure II.3.17: Two point sources emitting waves. The two sources are 4 wavelengths apart and are in phase.
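The adding of amplitudes along the two paths can be written down directly. This is a minimal sketch with our own function name; it assumes that both tubes deliver waves of equal unit amplitude to the microphone, and the 1 kHz tone in the example is an arbitrary illustrative choice.

```python
import cmath
import math


def combined_amplitude(path_difference, wavelength):
    """Amplitude at the microphone for two unit-amplitude waves.

    The wave through the longer tube picks up an extra phase k * path_difference,
    with k = 2 pi / wavelength.
    """
    k = 2.0 * math.pi / wavelength
    return abs(1.0 + cmath.exp(1j * k * path_difference))


if __name__ == "__main__":
    wavelength = 343.0 / 1000.0  # assumed 1 kHz tone at a sound speed of 343 m/s
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        delta = fraction * wavelength
        print(f"path difference {fraction} wavelengths: "
              f"amplitude {combined_amplitude(delta, wavelength):.3f}")
```

The amplitude is maximal when the path difference is a whole number of wavelengths and vanishes when it is an odd multiple of half a wavelength, which is the silent setting of the tube.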


Wave interference from two point sources. Figure II.3.17 shows two point sources emitting circular waves which are in phase. The two individual wave patterns overlap and will therefore interfere, meaning that at certain points the signals will amplify each other and at other points they will cancel. A new pattern of maxima and minima will develop. In Figure II.3.18 we show the pattern of water waves generated by two point sources that oscillate (almost) in phase. The pattern is obtained by literally adding up the amplitudes of the two individual spherical patterns coming from the two slits, which act as point sources. Plane waves come in from below, and this makes the two sources oscillate in phase. So this is indeed a double slit experiment and we see that the resulting pattern has a number of striking features. We roughly see rays of outward moving waves with an amplitude that varies depending on the angle.


Figure II.3.18: Water waves. Two slits act as sources emitting water waves that interfere. This illustrates the geometric constructions displayed in the following figures.


Figure II.3.19: Double slit interference. The slits act as sources
emitting semicircular waves that will interfere. Compare the
pattern with the water wave interference pattern of the previous
figure.


In Figure II.3.19 we give the theoretical reconstruction of
the situation combining the two previous figures. In the
top half of this figure we could mark the points where the
path difference from the two sources equals an integer
times the wavelength, and then connect the points
with equal differences, as we did in Figure II.3.20. What
we obtain are the orange colored hyperbolic rays along
which the maximal amplitude oscillations propagate upwards.
In between we could have drawn the zero amplitude
node lines connecting points where the difference is
a half-integral multiple of the wavelength. The pattern of
rays that emerges is not entirely obvious, because there is
no such thing as ‘adding’ rays; you add the wave patterns
and then construct the resulting ray pattern.
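
As a rough check on the ray construction, the sketch below lists the asymptotic directions of the maximal amplitude rays. It assumes the far-field approximation, in which the path difference for a direction at angle θ from the symmetry axis is d sin θ, with d the source separation; the function name is hypothetical.

```python
import numpy as np

def maxima_directions(separation, wavelength):
    """Far-field angles (degrees from the symmetry axis) of the maximal amplitude rays."""
    n_max = int(np.floor(separation / wavelength))
    orders = np.arange(-n_max, n_max + 1)          # path difference in wavelengths
    sines = np.clip(orders * wavelength / separation, -1.0, 1.0)
    return orders, np.degrees(np.arcsin(sines))

# Sources 4 wavelengths apart, as in Figure II.3.17.
orders, angles = maxima_directions(separation=4.0, wavelength=1.0)
for n, angle in zip(orders, angles):
    print(f"path difference {n:+d} wavelengths: direction {angle:6.1f} degrees")
```

For a separation of four wavelengths this gives nine directions, the outermost two lying along the line through the sources.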


Once we have the pattern of rays we can also draw the
new wavefront picture. The wave fronts correspond to the blue
elliptic curves in Figure II.3.21. Note that the rays and
wave fronts are indeed orthogonal at any point where they
meet: rays and wave fronts always form what are called two
orthogonal families of curves. These wave fronts move outward.
So what is the picture along any one of these wave fronts? It
crosses a fixed number of maximal amplitude rays and node
rays, and these rays stay fixed in time. Therefore we encounter
a one-dimensional standing wave pattern along the wave front,
and that is what is visible in the water wave picture of
Figure II.3.18.


The interference of light. In Figure II.3.22 we have de- 
picted the classical experiment of Young in which he showed 


Figure II.3.20: Rays. The orange maximal amplitude rays connect
points that have distances to the sources which differ by a
certain integer times the wavelength.


the interference of the light going through the two slits. It 
only occurs if both slits are open. If only one slit is open, 
one gets a single maximum comparable to that of clas- 
sical particles. The result was fully consistent with Huy- 
gens’ principle of light propagation following from the wave 
nature of light. Comparing this experiment with the pre- 
vious one on sound waves it is clear the sound measure- 
ment only corresponds to a single point on the detection 
screen for the light. Moving the trombone arm on the left of 
Quincke’s device corresponds with moving the light detec- 
tor up and down the screen, which is necessary to probe 
the minima and maxima of the interference pattern. 


The non-interference of marbles (classical particles).
In Figure II.3.23 we see a source shooting particles (say,
marbles) in all forward directions. Most get absorbed by
the screen but if they are directed to one of the two slits, the
particles can get through. If we count the number of particles
hitting the detector screen, we typically get a distribution
with two separate maxima as indicated in Figure II.3.23.
This is exactly what one would expect: there is no interference
of marbles, let alone that a marble would interfere
with itself.


Figure II.3.21: Wave pattern. If we draw the elliptical curves
orthogonal to the rays, we do not get the familiar wave fronts
where the phase difference is constant. Along the curves one
obtains a standing wave with varying amplitude and wavelength.


The self-interference of a quantum particle. In Figure II.3.24
we have sketched what happens with a beam
of quantum particles such as electrons or protons or neutrons
when they hit a screen with two narrow slits. The
quantessence is that it does not repeat the pattern of the
classical particles of Figure II.3.23 but rather that of light
depicted in Figure II.3.22. This fundamental experiment
demonstrates the wavelike nature of particles in the quantum
domain. The most remarkable, really quantessential
aspect of this behavior is that the phenomenon is not a
consequence of different particles in the beam interfering
with each other. This would make it a collective phenomenon,
but no, the truly remarkable fact is that if you
shoot the electrons one by one, then the interference pattern
would slowly build up as is shown in Figure II.3.25.
This implies that each electron somehow interferes with itself,
and one has to conclude that each electron has ‘knowledge’
of the probability distribution as a whole.


Figure II.3.22: Young’s experiment. The double slit experiment
for light as performed by Young to demonstrate the wave
character of light, thereby confirming Huygens’ theory of light. On
the right the varying intensity of the light on the screen due to
the interference.


Figure II.3.23: Marbles don't interfere. In the double slit
experiment with classical particles, the number density of particles
hitting the detector screen has two separate maxima and there
is no interference.


This is indeed the case in quantum physics, as the wavefunction
of the particle is exactly the probability amplitude
for finding it in any place at any time. Alternatively you may
say that in quantum theory you could calculate the probability
amplitude for distinct paths from the beginning to any endpoint
on the screen separately; the total amplitude from the
beginning to that given endpoint is then the sum of those
amplitudes. It is the linear superposition principle in a different
guise. Let us go one step further and assume that
ψ₁(x) describes the wavefunction for the configuration
with only the left slit open, and ψ₂(x) with only
the right slit. The (normalized) wavefunction for the experiment
with both slits open would then correspond to
ψ(x) = (ψ₁(x) + ψ₂(x))/√2, as we just have to add
the amplitudes. The probability of finding the particle on
a screen behind the slits is then not the same as the sum
of the probabilities of the individual left and right slit
experiments, because squaring the total amplitude yields

    p(x) = \tfrac{1}{2}\bigl(p_1(x) + p_2(x)\bigr) + I(x),

with p_i(x) = |ψ_i(x)|², where the interference term I(x) is defined as

    I(x) = \tfrac{1}{2}\bigl(\psi_1^*(x)\,\psi_2(x) + \psi_1(x)\,\psi_2^*(x)\bigr).    (II.3.1)


This is basically the one-particle quantum interference ef- 
fect, a direct consequence of the particle-wave duality in 
quantum physics. 
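
The decomposition of the two-slit probability into single-slit probabilities plus an interference term can be checked numerically. The sketch below uses toy amplitudes (a common Gaussian envelope with opposite transverse phases); the envelope, the wave number and the grid are illustrative assumptions and the amplitudes are left unnormalized.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 1001)        # position on the screen (arbitrary units)
k = 2.0                                 # illustrative transverse wave number
envelope = np.exp(-x**2 / 8.0)          # illustrative smooth envelope
psi1 = envelope * np.exp(1j * k * x)    # toy amplitude, left slit only
psi2 = envelope * np.exp(-1j * k * x)   # toy amplitude, right slit only

psi = (psi1 + psi2) / np.sqrt(2)        # both slits open: add the amplitudes
p1, p2, p = np.abs(psi1)**2, np.abs(psi2)**2, np.abs(psi)**2
interference = 0.5 * (np.conj(psi1) * psi2 + psi1 * np.conj(psi2)).real

# The probabilities do not add; the difference is exactly the interference term.
print(np.allclose(p, 0.5 * (p1 + p2) + interference))
print(interference.min(), interference.max())
```

The interference term takes both signs, which is what produces the alternating dark and bright fringes.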


In talking about quantum interference we should appreci- 
ate that a single particle is described by a wave pattern that 
may or may not be considered to be composed of different 
components, and therefore a particle can ‘interfere with it- 
self’ because of the linear superposition principle. And that 
is what makes quantum interference a truly quantessential 
phenomenon. 


At this point there is an additional remark I would like to
make. The question whether or not an interference pattern 


Figure II.3.24: Electrons are not like marbles. The double slit
experiment showing two conceivable paths that a quantum par- 
ticle like the electron may have taken. The variation in the in- 
tensity pattern on the screen demonstrates the wave nature of 
quantum particles. 


for the quantum particle will appear depends in a subtle 
way on what the experimental setup is. For example, look 
at the experiment of Figure II.3.26, where we have intro- 
duced a source which emits pairs of entangled particles; 
and particle 2 goes to the left and may or may not be de- 
tected, while particle 1 goes to the right in the direction of 
the double slit. The question is whether or not we will see 
an interference pattern as in Figure II.3.24. The answer is, 
that whether we will or will not see interference depends 
on the state of particle 2, irrespective of whether we actu- 
ally measure particle 2! It is the mere possibility of iden- 
tifying the path that particle 1 has taken that destroys the 
interference pattern. The state of the entangled particles 
is basically, 


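    |\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|\mathrm{red}_1\rangle|\mathrm{red}_2\rangle + |\mathrm{green}_1\rangle|\mathrm{green}_2\rangle\bigr).    (II.3.2)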


Figure II.3.25: How particles make a wave pattern. Buildup
of the interference pattern of Figure II.3.24, from the successive
hits of single particles (like electrons) on the screen.


The interference term for particle 1 would come from the
red-green cross term appearing in ⟨ψ|ψ⟩ evaluated along
the screen:

    I_{rg} = |\langle \mathrm{red}_1|\mathrm{green}_1\rangle|\,|\langle \mathrm{red}_2|\mathrm{green}_2\rangle| ,

and this term containing the self interference of particle 1
in the first factor will vanish if the second factor for
particle 2 vanishes because the |red₂⟩ and |green₂⟩ states
are orthogonal. Orthogonality here means that they have
no overlap: ⟨red₂|x⟩⟨x|green₂⟩ = 0 for all x. If they are
not, (some) interference will result, but as you see this re- 
ally depends on the actual setup of the experiment. As 
entanglement with the environment can easily take place, 
sufficient care has to be taken if one wants to demon- 
strate quantum interference effects. Physicists have gone 
one step further by investigating the effect of erasing the 
tracking information of particle 2, and they have shown 
that if you succeed in constructing a quantum eraser in 
your setup, the interference pattern will emerge. These in- 
between cases have been investigated in many different 
types of experiments. We will discuss one such experi- 
ment for photons shortly. 
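
How the fringe contrast depends on the overlap of the two marker states of particle 2 can be illustrated with a toy model. In the sketch below the overlap ⟨red₂|green₂⟩ is parametrized as the cosine of an angle and the slit amplitudes are plane waves of equal magnitude; both are assumptions made for illustration only, and the function names are hypothetical.

```python
import numpy as np

def screen_pattern(overlap_angle, num_points=1001):
    """Detection probability of particle 1 on the screen, ignoring particle 2.

    The marker states of particle 2 have overlap cos(overlap_angle).
    """
    x = np.linspace(0.0, np.pi, num_points)
    psi_red = np.exp(1j * x)        # toy amplitude via the 'red' slit
    psi_green = np.exp(-1j * x)     # toy amplitude via the 'green' slit
    red2 = np.array([1.0, 0.0])
    green2 = np.array([np.cos(overlap_angle), np.sin(overlap_angle)])
    # joint[x, j]: amplitude for particle 1 at x and particle 2 in component j.
    joint = (np.outer(psi_red, red2) + np.outer(psi_green, green2)) / np.sqrt(2)
    # Sum over the unobserved particle 2.
    return x, np.sum(np.abs(joint) ** 2, axis=1)

def visibility(pattern):
    """Fringe contrast (max - min) / (max + min)."""
    return (pattern.max() - pattern.min()) / (pattern.max() + pattern.min())

for angle in (0.0, np.pi / 3, np.pi / 2):
    _, pattern = screen_pattern(angle)
    print(f"overlap {np.cos(angle):.2f}: fringe visibility {visibility(pattern):.2f}")
```

In this model the visibility equals the magnitude of the overlap: orthogonal marker states wipe out the fringes, identical ones restore them fully, and partial overlap gives the in-between cases.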


Figure II.3.26: ‘Which path’ information. No interference of
particle 1 (moving to the right) if it is entangled with particle 2 
and thus a path identification would be possible in principle by 
measurement of particle 2. 


It is the non-commutativity of observables that gives rise to 
the intricacies in the quantum theory of measurement. The 
predictions of quantum mechanics are intrinsically proba- 
bilistic yet the theory is essentially different from classical 
probability theory. On the one hand it is clear that a given 
operator defines a probability measure on Hilbert space; 
however as the operators are non-commuting (like matri- 
ces) one is dealing with a non-commutative probability the- 
ory, and complementary measures. 


A basic interference experiment 


We have illustrated the schematic of a typical quantum 
interference experiment in Figure II.3.27 which compares 
two different states and their superposition in the familiar 
spin or qubit system. 


Figure II.3.27: Three experiments. Schematic of a typical quantum
interference experiment. Adding the red amplitudes of left (d)
and right (f) gives the purple amplitude of (e). The corresponding
probabilities do not add.
Panels: (a) X polarizer and beamsplitter. (b) X = −1 polarized in
left channel. (c) X = +1 polarized in right channel.
(d) Experiment 1: Measurement Z after blocking right channel:
p(+1) = p(−1) = ½. (e) Experiment 3: Measurement Z of left-right
interference: p(+1) = 1 and p(−1) = 0. (f) Experiment 2:
Measurement Z after blocking left channel: p(+1) = p(−1) = ½.


Figure II.3.28: Adding probabilities. A different schematic view
of the three interference experiments of Figure II.3.27 using the
symbolic notation of Figure II.3.15. Adding the final probabilities
of the first and second experiment does not give the probability
of the third experiment.


Figure II.3.29: Adding probability amplitudes. The same
schematic view of the three interference experiments as the
previous Figure II.3.28. In this figure we give the probability
amplitudes and now one finds that adding the total amplitudes of
the first and second experiment does give the amplitude of the
third experiment.


In the top figure (a) we have an incoming beam of identically
prepared spins that goes through a polarizer in the x direction
and is split into a left and a right channel with opposite
polarizations, as shown in the two middle row figures
(b) and (c). It is important to bear in mind that what we
say next applies to each particle individually. The beam is
just there to allow us to do a series of repeated measurements
in each setup. In the bottom row we have depicted
the probabilities for three distinct experimental setups, all
measuring the Z polarization indicated by the blue frame.
In figure (d) we give the situation if the right channel is
blocked, where we have |ψ⟩ = |ψ_l⟩ = |−⟩, corresponding to
the purple state vector, yielding equal probabilities to measure
plus or minus one: p_l(+1) = p_l(−1) = ½. Similarly
in figure (f) on the right we have blocked the left channel,
giving |ψ⟩ = |ψ_r⟩ = |+⟩, corresponding to the red state
vector in the figure, and we obtain once more p_r(+1) =
p_r(−1) = ½. Finally in the middle experiment of figure
(e) we let both channels interfere. Adding the probability
amplitudes in red of (d) and (f) yields the amplitudes in
purple of (e). Now we have to consider the (normalized)
superposition |ψ⟩ = (|ψ_l⟩ + |ψ_r⟩)/√2 = |1⟩, corresponding
to the purple arrow in figure (e), so that the probability
distribution becomes p(+1) = 1 and p(−1) = 0. This is
notably different from the sum of the probabilities of cases
(d) and (f), which would give p̃(±1) = ½(p_l(±1) + p_r(±1)),
yielding once more p̃(+1) = p̃(−1) = ½. The differences
between p(±1) and p̃(±1) are indeed due to the interference
terms I(+1) = +½ and I(−1) = −½.
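
The three experiments can be reproduced with two-component vectors. The sketch below works in the Z basis and takes |±⟩ = (|1⟩ ± |−1⟩)/√2 as the X eigenstates, which is the standard convention and is assumed here; the variable names are illustrative.

```python
import numpy as np

up = np.array([1.0, 0.0])      # |1>,  Z = +1
down = np.array([0.0, 1.0])    # |-1>, Z = -1
plus = (up + down) / np.sqrt(2)    # |+>, X = +1 (right channel)
minus = (up - down) / np.sqrt(2)   # |->, X = -1 (left channel)

def z_probabilities(state):
    """Probabilities (p(+1), p(-1)) for a Z measurement on a normalized state."""
    return np.abs(state) ** 2

p_left = z_probabilities(minus)                        # right channel blocked
p_right = z_probabilities(plus)                        # left channel blocked
p_both = z_probabilities((minus + plus) / np.sqrt(2))  # both channels interfere

p_classical = 0.5 * (p_left + p_right)   # what adding probabilities would give
interference = p_both - p_classical
print(p_left, p_right, p_both, interference)
```

Adding the amplitudes gives the state |1⟩ with certainty, whereas adding the probabilities would give one half for each outcome.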


In Figures II.3.28 and II.3.29 we present an alternative
visualization of the same three experiments using the symbols
introduced in Figure II.3.15 for the polarizer
settings. The left three panels give the probabilities and
one sees that they don’t add up, while in the right three
panels we give the amplitudes and one sees that they do
add up. This confirms our expectations for the interference
of the spin polarizations.
A delayed choice experiment


A modern and clean quantum incarnation of the canoni- 
cal double slit experiment is the interference experiment 
using a so-called Mach-Zehnder interferometer. In such
a device the self-interference of quantum particles/waves, 
and in particular photons, can be beautifully demonstrated. 
This setup is also called a ‘delayed choice experiment’ af- 
ter a gedanken proposal of John Archibald Wheeler, or a 
‘which-way experiment’. The delayed choice refers to the 
fact that the decision which experiment one is going to do 
is taken after the incoming particles have gone through the 
first polarizer thereby having chosen one of the two paths 
or both. In this clever setup the device randomly chooses 
between: 

(i) a ‘which way’ experiment where one identifies the path 
which the particle has chosen and thus no interference will 
take place, or 

(ii) a mode where the information on ‘which way’ is erased 
and one expects interference. 


In Figure II.3.30 I have sketched the schematic of such an
experiment¹ by the French group of Alain Aspect, who has
pioneered this type of experiment. It consists of two
components, first an input part on the left where the
polarizations get split. Next the photon travels over a considerable
distance of maybe 50 meters (but recently distances of
kilometers have been achieved). Finally the photon enters the
output part (on the right) where one measures whether the
photon has interfered with itself or not. The two components
are space-like separated,² so that there can be no
causal relation between the decision taken in the output
part and the preparation of the photon in the input part.


Single photons enter the interferometer on the left where
they go through a polarizing beamsplitter. The horizontally
polarized component goes up and the vertically polarized
component goes straight through. Reflection by the mirrors
does not change the polarization. Then the signal travels
some distance. The λ/2 plate, with its axis at 45°,
flips the horizontal and vertical polarizations. This is
necessary to allow the beams to be joined by the second
beamsplitter. They traverse the reversed path, so in fact
the second splitter acts like a ‘joiner’. By tilting the ‘joiner’
one can also introduce a phase difference φ between the
vertical and horizontal components, where the vertical
amplitude becomes e^{iφ/2} and the horizontal e^{−iφ/2}. The
further encounter depends on the random number generator
(RNG), which decides whether or not to effectively insert
another λ/2 wave plate.


¹V. Jacques et al., Science, Vol. 315 (2007).
²Space-like separated means that the output component is outside
the future and past light cones of the input component.


Figure II.3.30: Delayed choice. A Mach-Zehnder quantum
interference device, involving two polarizing beamsplitters of the
type shown in Figure II.3.13, which demonstrates the quantum
interference of photons.


Let us first assume the plate is not inserted. Then the photon
reaches another beam splitting prism that sends the
horizontal polarization up to detector D₁ and the vertical
polarization down to detector D₂. Furthermore, there is a
device that determines whether the detectors 1 and 2 fire
simultaneously. The beauty of the setup is of course
that all the counts in the detectors are recorded as well
as the random time series for the presence of the second
plate, and then a posteriori one calculates what has happened.
Clearly in this mode, the polarization of the photon
entering the prism carries the information about which
path the photon has taken. D₁ detects only the photons
that came along the lower path, and D₂ detects only
the photons that took the upper path. And indeed no
interference is observed, as is clear from the lower graph
in Figure II.3.32. The amplitudes do not add up, and the
probabilities are 1/2 and independent of the phase φ. The
punchline here is that the whole setup in this mode just
‘measures’ which path the photon has taken. Once the path
is known, the photon behaves just like a particle and no
interference is to be expected.


Figure II.3.31: The λ/2 wave plate. Effect of the λ/2 wave plate
with its principal axis at 22.5° to the vertical line. The component
orthogonal to the principal axis changes sign (phase = −1).
The result is that |v⟩ = |1⟩ → |+⟩ and |h⟩ = |−1⟩ → |−⟩.


Figure II.3.32: Single photon interference. The counts in the
detectors D₂ (red) and D₁ (blue), with and without interference.
The top graph gives the count with the wave plate in the output
channel and the bottom graph without. Taken from V. Jacques
et al., Science, (2007).


In the other mode of the interferometer, an additional λ/2
wave plate with its axis at an angle of 22.5° is inserted.
This has the effect that the polarizations are rotated as
Figure II.3.31 illustrates, so that |1⟩ → |+⟩ and |−1⟩ → |−⟩.
The important thing is that when the
photon enters the final prism the components of different
paths will mix again, the amplitudes will add and interference
will occur. This is of course assuming the photon took
both paths, which is what quantum theory predicts.


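So the vertical and horizontal amplitudes become:

    \alpha_v = \tfrac{1}{2}\bigl(e^{i\varphi/2} + e^{-i\varphi/2}\bigr) = \cos\frac{\varphi}{2} ,

    \alpha_h = \tfrac{1}{2}\bigl(e^{i\varphi/2} - e^{-i\varphi/2}\bigr) = i\,\sin\frac{\varphi}{2} .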


Thus, the probability for counts in D₂ becomes cos²(φ/2) =
½(1 + cos φ) and that for counts in D₁ equals sin²(φ/2) =
½(1 − cos φ). And this prediction is beautifully confirmed
by the data plotted in the top graph of Figure II.3.32. A
single photon interferes with itself; something more
quantessential is hard to imagine.
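
The two modes of the interferometer can be compared in a small two-component model of the polarization. The sketch assumes the state behind the ‘joiner’ is (e^{iφ/2}|v⟩ + e^{−iφ/2}|h⟩)/√2 and represents the extra wave plate by the matrix that maps |v⟩ to |+⟩ and |h⟩ to |−⟩; the function and constant names are hypothetical.

```python
import numpy as np

# Polarization basis: index 0 = vertical |v> = |1>, index 1 = horizontal |h> = |-1>.
WAVE_PLATE = np.array([[1.0, 1.0],
                       [1.0, -1.0]]) / np.sqrt(2)   # |v> -> |+>, |h> -> |->

def detector_probabilities(phi, plate_inserted):
    """Return (p_D1, p_D2) for a relative phase phi between the two paths."""
    state = np.array([np.exp(1j * phi / 2), np.exp(-1j * phi / 2)]) / np.sqrt(2)
    if plate_inserted:
        state = WAVE_PLATE @ state      # the paths mix: amplitudes add
    p_vertical, p_horizontal = np.abs(state) ** 2
    return p_horizontal, p_vertical     # D1 sees horizontal, D2 sees vertical

for phi in np.linspace(0.0, 2 * np.pi, 5):
    with_plate = detector_probabilities(phi, True)
    without_plate = detector_probabilities(phi, False)
    print(f"phi = {phi:5.2f}  with plate: {np.round(with_plate, 3)}"
          f"  without plate: {np.round(without_plate, 3)}")
```

Without the plate both detectors count with probability one half for every phase; with the plate the counts follow sin²(φ/2) and cos²(φ/2).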
Figure II.3.33: The Aharonov-Bohm phase factor. If a charge
q encircles a magnetic flux Φ, the quantum state of the particle
will acquire a phase factor W = exp(iqΦ/ħc).


The Aharonov-Bohm phase. 


Relative phase factors are all-important in quantum theory 
and lead to quantessential observable phenomena. 


One important example that comes back in many guises is
called the Aharonov-Bohm phase factor. The corresponding
effect is caused by inserting a magnetic flux filament in
the one-electron double-slit interference experiment. The
extra phase that results is due to the line integral of the
gauge potential A along a closed loop, which we introduced
already in the section on classical electrodynamics
of Volume I in equation (I.1.52) and Figure I.1.27.


Figure II.3.34: Path-independence. The phase factor does not
change under deformations of the path, as long as the region in
between the paths is free of magnetic fields.


Let us recall that if we are in a medium where there is some
electromagnetic potential and I have a charge q which I
move along a path γ from x₀ to x₁, then the state vector
or the wavefunction for that matter will be transformed by
a phase factor:

    \psi(x_1) = W(\gamma; x_1, x_0)\,\psi(x_0) ,    (II.3.3a)

    W(\gamma; x_1, x_0) = \exp\Bigl(\frac{iq}{\hbar c}\int_\gamma \mathbf{A}\cdot d\mathbf{l}\Bigr) .    (II.3.3b)


Here in the integral you take at every point along the path 

the component of the vector potential directed along the 
path. The outcome will in general depend on which path 
you choose. This phase factor is an interesting object, 
and we should pause for a moment to understand it bet- 
ter. 


Firstly note that it is what we call ‘non-local’, and under a
gauge transformation U(x) it transforms like

    W(\gamma; x_1, x_0) \;\to\; U(x_1)\,W(\gamma; x_1, x_0)\,U^\dagger(x_0) .

If we close the loop, then the phase factor becomes gauge
invariant, because we get U†(x₀)U(x₀) = U†U = 1: the
transformations act at the same point and therefore cancel
out.
What does this non-local gauge invariant quantity mean?
To understand that we go back to classical electrodynamics,
where we have the simple property called Stokes' theorem,
which tells us that if you calculate the line integral of A
around a closed loop γ, then you get the magnetic flux
through (any) two-dimensional surface bounded by the loop.
So this means that the loop operator W_γ ‘measures’ the
magnetic flux:

    W_\gamma(q, \Phi) = e^{iq\Phi/\hbar c} ,


which is indeed a gauge invariant quantity, as it should be.
Let us now go to a two-dimensional situation to simplify
the picture, and imagine we have a well-defined narrow
magnetic flux tube piercing through the surface as in
Figure II.3.33. If we adiabatically move a charge around the
flux Φ the state will change according to

    |q, \Phi\rangle \;\to\; W_\gamma(q, \Phi)\,|q, \Phi\rangle .


In Figure II.3.34 we show that deforming the contour
or loop does not affect the outcome as long as we do
not cross magnetic flux lines. In field free regions you
can deform the loop arbitrarily. Also, if you first go one 
way around the flux and you subsequently go back around 
some other loop encircling the flux in the opposite way, the 
net effect will be zero. 
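
The statement that the loop integral only ‘sees’ the enclosed flux can be checked numerically. The sketch below assumes an idealized, infinitely thin flux tube on the z-axis, whose vector potential outside the tube can be chosen as A = (Φ/2π)(−y, x)/(x² + y²); the loops, the midpoint-rule integration and all names are illustrative choices.

```python
import numpy as np

def vector_potential(x, y, flux):
    """Vector potential outside an idealized thin flux tube on the z-axis."""
    return flux / (2 * np.pi) * np.array([-y, x]) / (x**2 + y**2)

def loop_integral(loop_x, loop_y, flux):
    """Midpoint-rule estimate of the line integral of A around a closed polygon."""
    next_x, next_y = np.roll(loop_x, -1), np.roll(loop_y, -1)
    mid_x, mid_y = (loop_x + next_x) / 2, (loop_y + next_y) / 2
    a_x, a_y = vector_potential(mid_x, mid_y, flux)
    return np.sum(a_x * (next_x - loop_x) + a_y * (next_y - loop_y))

t = np.linspace(0.0, 2 * np.pi, 4000, endpoint=False)
flux = 1.0
circle = (2 * np.cos(t), 2 * np.sin(t))                 # encircles the flux
radius = 2 + 0.5 * np.cos(3 * t)
deformed = (radius * np.cos(t), radius * np.sin(t))     # deformed, still encircling
outside = (5 + np.cos(t), np.sin(t))                    # does not encircle the flux

for name, (lx, ly) in (("circle", circle), ("deformed", deformed), ("outside", outside)):
    print(f"{name:9s} loop integral = {loop_integral(lx, ly, flux):+.5f}")
```

Both encircling loops return the flux itself, whatever their shape, and the loop that stays clear of the tube returns zero.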


The beauty of this story is that one can directly measure
this gauge invariant phase factor W in a one-particle quantum
interference experiment. It is called the Aharonov-Bohm
effect, after the two theorists who proposed it in
1959 with reference to earlier work by Ehrenberg and
Siday.³ The setup of the experiment is given in Figure II.3.35.
The gauge and path independent extra phase factor W_γ
appears as a relative phase factor between the ψ₁ and ψ₂
factors in the interference term defined in equation (II.3.1),
causing the observed shift of the interference pattern shown
in the figure.


³And maybe this credential ambiguity explains why there was no
Nobel prize awarded for this fundamental effect.


Figure II.3.35: Path-independence. The presence of a magnetic
flux filament between the slits causes an extra phase
difference between the two paths. This leads to a shift of the
interference pattern from ‘red’ to ‘blue’ as indicated.


Phase shift due to magnetic flux.
Let us find out how this happens. We start with the free
particle Hamiltonian and then include the coupling to the
electromagnetic field through the vector potential A, as
we did in equation (I.1.44). This amounts to replacing the
derivatives ∇ by covariant ones, D = ∇ + iqA/ħc:

    H = -\frac{\hbar^2}{2m}\,\nabla^2 \quad\longrightarrow\quad H = -\frac{\hbar^2}{2m}\,D^2 .


Suppose we have solved the problem with A = 0, with
solutions ψ₁(x) and ψ₂(x). We want to find out what
changes if we take A ≠ 0. Consider the covariant derivative
working on any function; then we have the following
equality:

    D\,\psi(x) = W^*\,\nabla\bigl(W\,\psi(x)\bigr) , \qquad W = \exp\Bigl(\frac{iq}{\hbar c}\int_{x_0}^{x} \mathbf{A}\cdot d\mathbf{x}\Bigr) .

The only way the coupling to the A field manifests itself is
through the phase factor W. In other words the solutions
are linked as follows:

    \psi_i^A(x) = W^*(\gamma_i; x, x_0)\,\psi_i(x) , \qquad i = 1, 2 .
The phase factor looks awkward in that there suddenly
appears a point x₀ and the line integral along a path γ_i from
x₀ to x. But the identity holds for any choice of x₀ and
may depend on γ, as will become clear.


Now return to the interference term I(x) defined by
equation (II.3.1). One chooses for x₀ the position of the source,
and for ψ₁^A the path γ₁ has to be chosen to pass through
the first slit and for ψ₂^A a path γ₂ through the second slit.
Then the first term of I(x) involves the product:

    \psi_1^{A*}(x)\,\psi_2^{A}(x) = W(\gamma_1; x, x_0)\,W^*(\gamma_2; x, x_0)\,e^{i(\beta_2(x) - \beta_1(x))}\,|\psi_1(x)||\psi_2(x)| ,

where the β_i(x) are the phases of the A = 0 solutions. Note
that W*(γ; x, x₀) = W(γ; x₀, x), in other words the conjugation
reverses the path, but then the product of the two W
factors yields a closed path through both slits encircling the
magnetic flux, giving the overall phase factor W_γ(q, Φ) =
e^{iqΦ/ħc}. Putting it all together we obtain:

    I(x) = \cos\Bigl(\frac{q\Phi}{\hbar c} + \beta(x)\Bigr)\,|\psi_1(x)||\psi_2(x)| ,

with β(x) = β₂(x) − β₁(x).

What this calculation shows is that the position dependent
phase β(x) corresponding to the A = 0 solution gets shifted by
an amount proportional to the flux-charge product. This
shift is constant; it does not depend on where you are,
which means that the interference pattern generated by
β(x) gets shifted as a whole, as we have indicated in
Figure II.3.35. We will return to these Aharonov-Bohm phases
on page 416 of Chapter II.5, where we talk about exotic
particle spin and statistics properties in two dimensions.
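
The rigid shift of the pattern can be seen in a toy version of the interference term. The sketch assumes a phase difference that grows linearly along the screen, β(x) = κx, and a constant envelope |ψ₁||ψ₂| = 1; κ, the function name and the sample values are illustrative assumptions.

```python
import numpy as np

def interference_term(x, ab_phase, kappa=1.0):
    """Toy interference term cos(ab_phase + beta(x)) with beta(x) = kappa * x."""
    return np.cos(ab_phase + kappa * x)

x = np.linspace(-10.0, 10.0, 2001)
ab_phase = 0.7   # stands for q * Phi / (hbar * c); illustrative value

shifted = interference_term(x, ab_phase)
# The same pattern, displaced along the screen by ab_phase / kappa:
print(np.allclose(shifted, interference_term(x + ab_phase, 0.0)))
# A flux giving a phase of exactly 2*pi leaves the pattern unchanged:
print(np.allclose(interference_term(x, 2 * np.pi), interference_term(x, 0.0)))
```

The second check shows that the pattern only registers the flux modulo the amount that produces a full turn of the phase.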


Why is this an important effect? This experiment shows
a really interesting aspect of electrodynamics. The electrons
in this experiment are shielded from the flux. They
only travel through regions of space where both the electric
and the magnetic fields E and B are strictly zero. The vector
potential A is non-zero, but it is a gauge dependent field
and therefore not a local physical observable like the other
fields. In fact locally it is a gauge transform of the vacuum;
in other words, locally the gauge potential can always be
gauged to zero! And yet, there is an observable effect! The
clue is that there is a subtle nonlocal gauge invariant
observable which involves only the vector potential, namely
the loop integral; if its value is non-zero it cannot be gauged
away. This means that if you tried to transform the
gauge field to zero everywhere, that transformation would
not be single valued and therefore not be a proper gauge
transformation. It is this gauge invariant observable that is
measured in this quantessential experiment.


Figure II.3.36: Super phase (A). Phase of the superconducting
condensate. This is the ground state with the trivial constant
phase equal to one. This configuration has winding number n = 0.


Figure II.3.37: Super phase (B). Phase of the superconducting
condensate. This is again the trivial ground state, with the
phase changed by a local gauge transformation; the winding
number is still zero.


Figure II.3.38: Super phase with flux (A). Phase of the
superconducting condensate with a magnetic flux tube in the center in
the so-called radial gauge. The phase rotates by 2π after
encircling the flux once along a closed curve like the green one in the
figure. The winding number of this configuration equals n = 1.


Flux quantization in a superconductor. Let me point out
another crucial ‘application’ of this argument in the context
of superconductors, in particular type II superconductors.
The defining property of superconductors is that their
resistance is zero. If you were to move a piece of superconducting
material in a magnetic field, super currents would
start running so as to expel the magnetic field lines out of the
superconductor. This is the Meissner effect. In type II
superconductors it is possible for flux lines to enter the
medium, but only if the amount of flux satisfies a certain
flux quantization condition. The situation is very similar to
what we are discussing here: there is a superconducting
ground state that corresponds to a condensate of pairs of
electrons. These Cooper pairs have charge 2e, and the
medium has no electromagnetic field except for the filaments.
The condensate is static and effectively described
by a complex scalar field ψ(x) that is doubly charged and
carries an electromagnetic phase factor. To say that the
pairs are condensed means that the field acquires
a constant non-zero magnitude, and because it describes
the ground state it is called a vacuum expectation
value. We write |ψ(x)| = 1, so that ψ is described by a pure
phase factor with angle β(x). In Figure II.3.36 we have
plotted the local phase β of the condensate in the ground
state. The gauge field is in this case globally gauged to
zero and the corresponding phase is trivial, β(x) = 0. In
Figure II.3.37 we have made a local (x-dependent) gauge
transformation which changes β → β + Λ(x). If we follow
the phase along a closed curve like the green one, the
phase will change back and forth, but the net change after
returning to the initial point remains zero. We say that the
winding number of the configuration is n = 0. This winding
number is not just gauge invariant. It is a topological invariant,
which means that it cannot be changed by any smooth
transformation of the gauge potential or the phase β(x). If
we follow the phase around a magnetic flux line, the state
should certainly return to the same value. It should be single
valued because it is a macroscopic state describing the
condensate of Cooper pairs. We discussed this briefly in
Chapter II.1 when discussing the Josephson effect. The
upshot is that only fluxes are admitted that are ‘invisible’
to the medium, or the condensate. In other words, we
want the induced phase factor to be equal to one, which
implies:

    W_\gamma(2e, \Phi) = 1 \quad\Longrightarrow\quad \frac{2e\,\Phi}{\hbar c} = 2\pi n ,
Figure II.3.39: Super phase with flux (B). Same physical situation
as in the previous figure after a local gauge transformation;
the winding number did not change.


which means that the flux is quantized according to:

    \Phi = n\,\Phi_0 , \qquad \text{with} \quad \Phi_0 = \frac{hc}{2e} ,

and Φ₀ is called the fundamental flux quantum, which is
expressed directly in fundamental constants. In
Figures II.3.38 and II.3.39 we show the phases of the condensate
after a flux tube has entered. Moving along the green
curve encircling the flux, the phase changes by 2π. This is
clear in the radial gauge of the first figure but remains true
after a gauge transformation has been applied.


This quantization rule is exactly what has been observed
in type II superconductors. A flux tube has a negative surface
energy and therefore an arbitrary flux tends to decay
into individual minimally quantized filaments. These repel
each other and therefore, if there is a strong magnetic field
causing many tubes, these will form a lattice, a two-dimensional
triangular crystal. If you keep turning up the magnetic field
strength, the pairs will break up and therefore the superconductive
state will break down at some critical value for
the magnetic field.
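To get a feeling for the size of the flux quantum, here is a minimal numerical sketch. It assumes SI units, in which the same quantum reads h/2e (the factor c in the formula above belongs to Gaussian units); the function names `flux_quantum` and `quantized_flux` are illustrative only.

```python
# Exact SI values of the constants (fixed by the 2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34            # Planck constant in J s
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge in C


def flux_quantum() -> float:
    """Superconducting flux quantum h / (2e) in weber (SI form, assumed here)."""
    return PLANCK_H / (2.0 * ELEMENTARY_CHARGE)


def quantized_flux(n: int) -> float:
    """Flux carried by a configuration with winding number n."""
    return n * flux_quantum()


phi0 = flux_quantum()
print(f"flux quantum: {phi0:.6e} Wb")
print(f"flux for n = 3: {quantized_flux(3):.6e} Wb")
```

The factor 2e in the denominator reflects the charge of a Cooper pair; a single flux tube in a type II superconductor therefore carries about two femtoweber.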


The Berry phase
You may ask whether it is really possible to 'drag' a state
vector along a closed loop like we described and whether
the resulting phase change can be measured. The answer
is affirmative. In this subsection we will discuss the
Berry phase, which is a substantial generalization of the
Aharonov-Bohm phase, named after the British mathematical
physicist Sir Michael Berry, who discovered the possibility
to measure holonomies in certain experimental setups
with a well-chosen time or space dependent Hamiltonian.


The question is how to translate the rather abstract pictures
of parallel transport into a suitable experimental setup.
The idea behind Figure I.2.32 is clear: there is an
‘agent’ carrying the state vector, and by moving through 
space the frame changes and therefore the parallel trans- 
ported vector appears to be rotated with respect to the ini- 
tial local frame. 


In the qubit or spin-one-half context you may think of the 
agent as an electron carrying a qubit (spin-one-half spinor) 
around. If we apply a magnetic field, the spin will align or 
anti-align with the external field as that minimizes the inter- 
action energy. The ground state of the spin depends there- 
fore on the orientation of the magnetic field. So to get the 
spin to move through its state space, we should move the 
electron in real space through an inhomogeneous mag- 
netic field or we should fix its position and change the field. 
And by walking around along a closed loop in real space- 
time we may find the state of the spin is rotated by some 
phase angle. In other words, due to the inhomogeneous 
magnetic field, a closed loop in space-time gets mapped 
onto a smooth path in state space that is not necessarily 
closed. 


In fact Berry took the approach where he looked at a time-dependent
Hamiltonian $H(t)$. We have said that the time
evolution of a state is generated by the Hamiltonian. If
the Hamiltonian is time-independent and the system is
in an eigenstate of that Hamiltonian, then the time dependence
is the time-dependent phase factor
$|\psi_n(t)\rangle = \exp(-iE_n(t - t_0)/\hbar)\,|\psi_n(t_0)\rangle$. The question is now what
happens if the Hamiltonian becomes time dependent. You
can think of the Hamiltonian having a set of parameters
$\{c_i\}$. For example, if we consider the coupling of a spin
to an external magnetic field, the parameters would correspond
to choosing the direction and the strength of that
external field. And the time-dependent Hamiltonian we are
interested in would be one where we slowly vary these parameters:
$H(t) = H(\{c_i(t)\})$. So the experiment is set up
to see what happens if we make a round-trip through this
parameter space or the space of Hamiltonians. The Hamiltonian
moves between $t_0$ and $t_f$ along a closed path
in parameter space so that $H(t_0) = H(t_f)$. The choice
of this path is of course made by the experimenter. In
Figure II.3.40 we have depicted a time-dependent closed
path (pink) through a two-dimensional coordinate space of
Hamiltonians where $c(t_0) = c(t_f)$. The figure also shows
the yellow path straight up, corresponding to the time-independent
Hamiltonian $H = H(t_0)$, leading to the aforementioned
phase factor $\exp(-iE_n(t - t_0)/\hbar)$.


The expression for the phase factor. We assume that
the Hamiltonian $H(t)$ has a time-dependent discrete spectrum:

$$ H(t)\,|n(t)\rangle = E_n(t)\,|n(t)\rangle . $$

If we now assume that we vary the Hamiltonian slowly
so that the system smoothly (adiabatically) evolves in the
state $|n(t)\rangle$, we can construct an approximate solution:

$$ |\psi(t)\rangle = C_n(t)\,\exp\!\Big(-\frac{i}{\hbar}\int_{t_0}^{t} E_n(t')\,dt'\Big)\,|n(t)\rangle \,, $$

and because $|\psi(t)\rangle$ and $|n(t)\rangle$ are both normalized the
coefficient $C_n(t)$ can only be a phase:

$$ C_n(t) = \exp(i\gamma_n(t)) . $$


Figure II.3.40: Berry phase. A closed (circular) path in Hamiltonian
space with coordinates $c = (c_1, c_2)$. The system follows
the pink curve in time such that $H(t_0) = H(t_f)$.


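We can substitute this solution into the time-dependent
Schrödinger equation,

$$ i\hbar\,\frac{\partial}{\partial t}\,\psi(t) = H\,\psi(t) \,, $$

to obtain an equation for the phase:

$$ \frac{d\gamma_n(t)}{dt} = i\,\langle n(t)|\,\frac{d}{dt}\,|n(t)\rangle \,, $$

which has the solution

$$ \gamma_n(t) = i\int_{t_0}^{t} \langle n(t')|\,\frac{d}{dt'}\,|n(t')\rangle\,dt' . $$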
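The substitution step can be spelled out as follows. This is a sketch of the standard adiabatic argument, assuming that the system stays in the $n$-th eigenstate throughout and writing the unknown coefficient as a pure phase $e^{i\gamma_n(t)}$; the shorthand $\theta_n(t)$ for the dynamical phase is introduced here only for brevity.

```latex
% Ansatz, with the dynamical phase abbreviated as \theta_n(t):
|\psi(t)\rangle = e^{i\gamma_n(t)}\, e^{-i\theta_n(t)}\, |n(t)\rangle ,
\qquad
\theta_n(t) = \frac{1}{\hbar}\int_{t_0}^{t} E_n(t')\,dt' .

% Left-hand side of the Schroedinger equation:
i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle
  = \Big(-\hbar\,\dot{\gamma}_n(t) + E_n(t)\Big)|\psi(t)\rangle
  + i\hbar\, e^{i\gamma_n(t)} e^{-i\theta_n(t)}\,\frac{d}{dt}|n(t)\rangle .

% Right-hand side:
H(t)\,|\psi(t)\rangle = E_n(t)\,|\psi(t)\rangle .

% The E_n terms cancel. Projecting the remainder onto \langle n(t)|
% and using \langle n(t)|n(t)\rangle = 1 gives
\dot{\gamma}_n(t) = i\,\langle n(t)|\,\frac{d}{dt}\,|n(t)\rangle .
```

Because $\langle n|n\rangle = 1$ at all times, the quantity $\langle n|\,d/dt\,|n\rangle$ is purely imaginary, so the phase $\gamma_n$ obtained this way is real, as a phase should be.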


Berry connection and curvature. To give this phase a
direct physical interpretation let us look at the integrand
and ask what we mean by the state $|n(t)\rangle$. The time dependence
is not the time dependence of $n$ but rather of
the state labeled by $n$. The time dependence all comes
through the changing of the parameters $c_i(t)$. The appropriate
notation is in fact to write $|n(t)\rangle = |n;\{c_i(t)\}\rangle = |n;\mathbf{c}(t)\rangle$,
where I combined the parameters into a vector, a
position vector in parameter space. If I now take the time
derivative of the state, I may rewrite that as follows:


$$ \frac{\partial\,|n;\mathbf{c}(t)\rangle}{\partial t} = \nabla_c\,|n;\mathbf{c}(t)\rangle \cdot \frac{d\mathbf{c}(t)}{dt} \,, $$


where the nabla operator is the vector of $\partial/\partial c_i$ derivatives,
in other words the gradient operator acting on functions of
the parameter vector.


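This turns the time integral for the phase into a loop integral
on parameter space over a connection (or pseudo
gauge potential) $\mathbf{C}(\mathbf{c})$ named after Berry:

$$ \gamma_n = i\oint \langle n;\mathbf{c}|\,\nabla_c\,|n;\mathbf{c}\rangle \cdot d\mathbf{c} = \oint \mathbf{C}\cdot d\mathbf{c} . $$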


In other words, using Stokes' theorem the phase can
be expressed as a surface integral of the corresponding
Berry curvature $\mathbf{F} = \nabla_c \times \mathbf{C}$:

$$ \gamma_n = \oint \mathbf{C}\cdot d\mathbf{c} = \int \mathbf{F}\cdot d\mathbf{S} . $$


There is a striking analogy with the Aharonov-Bohm case, 
but it is also clear that the Berry analysis is much more 
general. 


Spin coupled to an external magnetic field. 


To be more concrete about such an experiment, imagine
a closed path $\mathbf{c}(t)$ in time parametrized by a parameter
$0 \le t \le 1$ with $\mathbf{c}(0) = \mathbf{c}(1)$. The system is an electron
spin coupled to a slowly varying external magnetic field
$\mathbf{B}(t)$, with a Hamiltonian

$$ H(t) = \mathbf{B}(t)\cdot\boldsymbol{\sigma} \,, $$

a hermitian $2\times 2$ matrix acting on the two-component electron
spin.


Let us first ask what the space of Hamiltonians looks like,
which is asking for a natural parametrization of all magnetic
fields.


The field $\mathbf{B}(t)$ has some direction and some magnitude.
As shown in Figure II.3.43 we choose spherical coordinates
in $\mathbf{B}$ space. So the direction is parametrized by the
angular coordinates $\theta$ and $\varphi$, while the magnitude is given
by the radial coordinate. If we only change the direction
of the external field, the space of possible Hamiltonians
would just correspond to the radial magnetic fields on a
spherical surface of constant radius. Note that this looks
like the field surrounding a magnetic monopole as we have
drawn in Figure I.1.29.


The starting point with the Berry phase experiment is to
choose the time path that gets mapped onto some closed
curve $\mathbf{c}(t)$ in the space of Hamiltonians, thus on the
two-sphere in this case.


The adiabatic change or 'dragging' of the state amounts to
parallel transporting a frame (of the tangent plane) along
the curve, like we discussed in Chapter I.2 in the section
on geometry.


As we will show shortly, the result for the acquired phase 
will depend on the solid angle that the path H(t) has cov- 
ered on the sphere.* This means that the Berry phase is 
a purely geometric phase (in fact a holonomy) which de- 
pends on the geometry of the space of Hamiltonians, but 
also on the probe (in this case a spinor). 


The idea is simple: at $t = 0$ we start at the North Pole with
the Hamiltonian $H(0) = BZ$ and the energy eigenstates
correspond to $|\psi_\pm(0)\rangle = |\pm 1\rangle$. Next we start rotating the
magnetic field and we assume that the initial eigenspinor
just follows. In that sense it is fair to say that the Berry
phase probes the Hamiltonian space but also the spin or
qubit space, which is a three-sphere $S^3$ as we know.

* The path is oriented and the orientation decides whether to take
the solid angle $\omega$ or $4\pi - \omega$, which with equation (II.3.4) amounts to
$R_k(\theta) \to R_k(-\theta)$.


To work this out in more detail for an electron spin, or a qubit
for that matter, we first look at how the rotations act on the
spinors and then we find a convenient parametrization of
the magnetic field space.


To better understand what I mean by 'probing the state
space' of a qubit I propose we return to the 'Barbie on a
globe' representation of the qubit, as we introduced it in
Chapter II.2 on page 286. What you see there is that we
represent the qubit as a vector or rather spinor bundle over
a two-sphere, where a particular qubit state corresponds
to a unique tangent vector at some point on that sphere.
And the $X$, $Y$ and $Z$ operators are generating the 'motions'
of the Barbie in that space.


Probing the geometry of state space 


Let us first visualize the actions discussed above in
Figure II.2. We see the Barbie standing on the North Pole,
say looking West, corresponding to the state $-|1\rangle$. Acting
with $Z$ does not affect her at all, but acting with $X$ moves
her to the state $-|-1\rangle$, which is the mirror image of the
initial state through the origin of the tangent space.


To probe the space in more detail we have to construct 
operators that move the Barbie around on the sphere and 
make her perform pirouettes. What we need are rotations 
generated around various axes, and these correspond to 
exponentials of X, Y and Z. 


Rotation of qubits. As we will explain in more detail in
the Math Excursion on page 635 of Part III on groups,
this amounts to going from the Lie algebra of infinitesimal
transformations to the corresponding Lie group of finite
transformations. Here we need the explicit relation for any
of the Pauli matrices $\sigma_k = X$, $Y$ or $Z$ that we introduced in

Figure II.3.41: Effect of rotations around Y-axis on Barbie. Rotating
the Barbie state $|1\rangle$ around a large circle in the ZX-plane
by angles $\theta = \pm n\pi/2$. The rotation angle in this representation
equals $\theta$. As she only passes through real states, the overall
phase $\beta$, denoted by the white arrow, is either $0$ or $\pi$. Rotating
over an angle $2\pi$ any state goes to minus itself.


equation (A.34) of the chapter on Math Excursions:

$$ R_k(\theta) = \exp\!\Big(i\,\frac{\theta}{2}\,\sigma_k\Big) = \mathbb{1}\cos\frac{\theta}{2} + i\,\sigma_k\sin\frac{\theta}{2} \,, \tag{II.3.4} $$

which should be compared with its one-dimensional analogue,
the Euler formula (A.28). At this point we recall
some important observations we made before.


1. Since the spinor or qubit is a two-dimensional com- 
plex vector, the rotations are relatively simple two- 
by-two unitary matrices which can be given explicitly 
as you see. 


2. These complex rotations form the group SU(2). 


3. The formula for $R_k(\theta)$ represents a rotation about the
k-axis over an angle $\theta$. That means to say that acting
on an ordinary three-dimensional vector like $\mathbf{B}$, it
rotates over an angle $\theta$ in real space. That is, under
a rotation $R_k(\theta)$, the Hamiltonian will rotate as:

$$ H \to H' = \mathbf{B}\cdot\boldsymbol{\sigma}' = \mathbf{B}\cdot R\,\boldsymbol{\sigma}\,R^{-1} . \tag{II.3.5} $$


4. However, on the two-component complex state vector
of a qubit the rotation acts like

$$ |\psi\rangle \to |\psi\rangle' = R_k\,|\psi\rangle . $$

Note that it 'rotates' only over half that angle because
of the factor $\theta/2$ in the formula (II.3.4).


5. This factor one-half has dramatic consequences. For
example a rotation by $\theta = 2\pi$ around any axis produces
in (II.3.4) just minus the unit matrix! So, under
such a transformation the qubit state always goes
to minus itself. One has to rotate by $4\pi$ before one
gets back to the unit matrix. This is indeed a defining
property for a spinor, to be contrasted with rotating
an ordinary vector over an angle $2\pi$, which always
gives the same vector back. This minus sign for a
spinor has a deep physical significance for particles
carrying half-integer spin as we will explain later. It
is one of those minus signs that does matter a great
deal.


6. Note that under the rotations the norm of the state is
preserved,

$$ \langle\psi'|\psi'\rangle = \langle\psi|\psi\rangle \,, $$

and this is what we expect as we are moving over
a sphere, because by taking the complex conjugate
vector the transformation goes to its hermitian
conjugate, which means changing $i \to -i$ or $\theta \to -\theta$,
which amounts to the same thing. This means
that the conjugate rotates by the opposite amount,
so that the net effect of the rotation on the inner product
of a vector with itself (or any other vector) always
cancels.
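The observations in this list can be checked numerically. The following is a minimal sketch using NumPy; the helper names `rotation` and `bloch_vector` are illustrative, and the choice of a rotation about the z-axis acting on the Hamiltonian $H = X$ is just one convenient example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rotation(sigma, theta):
    """Closed form of exp(i * theta/2 * sigma) for a Pauli matrix sigma (eq. II.3.4)."""
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * sigma


def bloch_vector(H):
    """Components (Bx, By, Bz) of a traceless hermitian matrix H = B . sigma."""
    return np.array([np.trace(s @ H).real / 2 for s in (X, Y, Z)])


theta = 0.7
R = rotation(Z, theta)

# The Hamiltonian H = X (field along x) is turned through the full angle theta
# in the xy-plane; the sense of rotation depends on the sign convention.
H_rotated = R @ X @ R.conj().T
print("rotated field direction:", bloch_vector(H_rotated))

# A spinor only notices theta/2: 2*pi gives minus the unit matrix, 4*pi the unit matrix.
print("R(2 pi) = -1:", np.allclose(rotation(Z, 2 * np.pi), -I2))
print("R(4 pi) = +1:", np.allclose(rotation(Z, 4 * np.pi), I2))

# The norm of an arbitrary normalized state is preserved.
psi = np.array([0.6, 0.8j])
print("norm after rotation:", np.vdot(R @ psi, R @ psi).real)
```

The same checks go through for rotations about the x- and y-axes by passing `X` or `Y` to `rotation`.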


To familiarize ourselves a bit with these rotations, let us 
first restrict our attention to real qubit state vectors as in- 


Figure II.3.42: Effect of rotations on the real circle of qubit
states. Same situation as previous figure, but now we have plotted
the states at half the angle $\theta$ for $-2\pi \le \theta \le 2\pi$. In the
upper half of the circle the overall phase $\beta$ is $0$ for real states
and in the lower half it is $\pi$.


troduced in Figure II.1.7. These states form a real circle. 
The operators $Z$ and $X$ are qubit observables which have
real eigenvectors. For $Z$ those are $\pm|1\rangle$ and $\pm|-1\rangle$,
with eigenvalues $+1$ and $-1$ respectively. Similarly for $X$ we
have $\pm|+\rangle$ and $\pm|-\rangle$, also with eigenvalues $+1$ and $-1$
respectively. We may ask which operator would move you
around in that subspace of real states, on that circle. Such
moves correspond to rotations about the Y-axis, generated
by $Y$, and indeed a rotation by an angle $\theta$ yields the real
matrix:

$$ R_y(\theta) = \cos\frac{\theta}{2} + iY\sin\frac{\theta}{2} = \begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ -\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix} , \tag{II.3.6} $$

which indeed corresponds to a rotation of the qubit state
vector over an angle $\theta/2$. In other words, rotations about
the Y-axis in the $(|-1\rangle, |1\rangle)$ plane move a state around the
circle.


So let us find out what the formula yields for rotations over
multiples of $\pi$ acting on the $|+1\rangle$ state, and then visualize
the results in the two different representations of the qubit
space corresponding to (i) the 'Barbie on the globe' picture,
and (ii) the real circle of Figure II.1.7.


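Using the formula (II.3.6) we obtain the following values for
some ($-2\pi \le \theta \le 2\pi$) rotations. For a transformation by
$-\pi$ we find:

$$ R_y(-\pi)\,|1\rangle = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |-1\rangle \,, $$

and for the others:

$$ R_y(\pm\tfrac{\pi}{2})\,|1\rangle = |\mp\rangle \,, \qquad R_y(\pm\pi)\,|1\rangle = \mp|-1\rangle \,, $$

$$ R_y(\pm\tfrac{3\pi}{2})\,|1\rangle = -|\pm\rangle \,, \qquad R_y(\pm 2\pi)\,|1\rangle = -|1\rangle . $$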
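These values are easy to reproduce. The sketch below assumes the conventions $|1\rangle = (1, 0)$, $|-1\rangle = (0, 1)$ and $|\pm\rangle = (|1\rangle \pm |-1\rangle)/\sqrt{2}$; the function name `R_y` is chosen here to mirror formula (II.3.6).

```python
import numpy as np

Y = np.array([[0, -1j], [1j, 0]])
I2 = np.eye(2)


def R_y(theta):
    """Rotation about the Y-axis, formula (II.3.6); the result is a real matrix."""
    return (np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * Y).real


up = np.array([1.0, 0.0])           # |1>
down = np.array([0.0, 1.0])         # |-1>
plus = (up + down) / np.sqrt(2)     # |+>
minus = (up - down) / np.sqrt(2)    # |->

for theta in (np.pi / 2, np.pi, 3 * np.pi / 2, 2 * np.pi):
    for sign in (+1, -1):
        state = R_y(sign * theta) @ up
        print(f"theta = {sign * theta / np.pi:+.1f} pi ->", np.round(state, 3))
```

Only the overall sign of the real state vector changes along the way, which is the jump of the phase by $\pi$ discussed in the text.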


If we carry a spinor along a large circle over an angle of
$2\pi$ we obtain from (II.3.6) just a (phase) factor minus one.
We have illustrated the sequence of values just calculated
in Figure II.3.41, which should be compared with the 'Barbie
on the globe' figure on page 286. The rotations for
increasing values of $\theta$ correspond to the Barbie moving by
the same angle over the globe, anti-clockwise in the vertical
plane. The states remain real for all $\theta$ and the only
phase change that may occur is that it jumps from $1$ to $-1$
or the other way around. This corresponds to a jump in
the phase angle of $\beta = \pm\pi$, depicted by the white arrows
either pointing up or down in the figures.


In Figure II.3.42, the same sequence is represented in the
standard qubit decomposition that we introduced in
Figure II.1.7, and we see indeed the phase jumping at odd
multiples of $\theta = \pm\pi$.


You may think of this as a holonomy effect, referring to
the concept we introduced in Chapter I.2 while discussing
parallel transport of vectors through curved space, which
is exactly what we are doing here. If the Barbie parallel
transports her spinor, it may pick up a phase factor equal to
minus one. So if she starts walking along a big circle on
the sphere looking straight ahead, she will be looking straight
back upon returning. What Figures II.3.41 and II.3.42
show is that the Barbie at $\theta = \pm\pi$ suddenly turned her
head by 180°...


Figure II.3.43: Radial magnetic field $\mathbf{B}(r, \theta, \varphi)$. The Hamiltonian
landscape.


Magnetic field space. We choose the field initially to point
in the positive z-direction, $\mathbf{B}(0) = B\hat{\mathbf{z}}$, and the spin to be in
the aligned up state $|1\rangle$, so in the $n = 1$ energy eigenstate.
We change the direction of the field slowly, so that the spin
stays aligned with the varying external field.


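From the figure we learn what the $x$, $y$ and $z$ components
of $\mathbf{B}$ are in terms of the angular variables:

$$ B_x = B\sin\theta\cos\varphi \,, \qquad B_y = B\sin\theta\sin\varphi \,, \qquad B_z = B\cos\theta . $$

And thus the Hamiltonian of equation (II.3.5) corresponding
to a point $\hat{\mathbf{c}}$ on the sphere (we set $B = |\mathbf{B}| = 1$) looks in
matrix form like:

$$ H_c = \hat{\mathbf{c}}\cdot\boldsymbol{\sigma} = \begin{pmatrix} \cos\theta & e^{-i\varphi}\sin\theta \\ e^{i\varphi}\sin\theta & -\cos\theta \end{pmatrix} . $$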


The two eigenstates with eigenvalues plus and minus one
correspond to the spinors:

$$ |{+1};\mathbf{c}\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ e^{i\varphi}\sin\frac{\theta}{2} \end{pmatrix} \quad\text{and}\quad |{-1};\mathbf{c}\rangle = \begin{pmatrix} -e^{-i\varphi}\sin\frac{\theta}{2} \\ \cos\frac{\theta}{2} \end{pmatrix} . $$


Figure II.3.44: Magnetic field space. The space of a magnetic
field of constant magnitude can be represented as the field on
a sphere around a magnetic monopole. Adiabatic transport of
a spin-1/2 particle moving along a closed path (in red) in the
radial magnetic field of a pole of strength $g = 2\pi\hbar/e$ centered
at the origin (blue).


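The adiabatic process. The process of adiabatically moving
over the sphere corresponds to making rotations around
the proper axes and angles. So for example to move from
the North Pole to the point $(r, \theta, \varphi)$ one may apply the
transformation(s):

$$ R(\theta, \varphi) = \begin{pmatrix} \cos\frac{\theta}{2} & -e^{-i\varphi}\sin\frac{\theta}{2} \\ e^{i\varphi}\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix} . $$

One checks that:

$$ H(\theta, \varphi) = R\,Z\,R^{-1} \quad\text{and}\quad |{+1};\mathbf{c}\rangle = R\,|1\rangle \,, $$

as it should be.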
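The two checks mentioned here can be done numerically. This is a sketch under an explicit assumption: the transformation is taken to be the unitary matrix whose columns are the aligned and anti-aligned eigenspinors, with the phase convention in which the upper component of the aligned spinor is real. The function names `hamiltonian` and `transport` are illustrative.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)


def hamiltonian(theta, phi):
    """H = c . sigma for a unit field in the direction (theta, phi)."""
    return np.array([
        [np.cos(theta), np.exp(-1j * phi) * np.sin(theta)],
        [np.exp(1j * phi) * np.sin(theta), -np.cos(theta)],
    ])


def transport(theta, phi):
    """Assumed form of the unitary that carries the North Pole to (theta, phi)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [c, -np.exp(-1j * phi) * s],
        [np.exp(1j * phi) * s, c],
    ])


theta, phi = 1.1, 2.3                          # an arbitrary point on the sphere
R = transport(theta, phi)
H = hamiltonian(theta, phi)
aligned = R @ np.array([1, 0], dtype=complex)      # image of |1>
anti_aligned = R @ np.array([0, 1], dtype=complex)  # image of |-1>

print("H = R Z R^-1:", np.allclose(R @ Z @ R.conj().T, H))
print("R|1> has eigenvalue +1:", np.allclose(H @ aligned, aligned))
print("R|-1> has eigenvalue -1:", np.allclose(H @ anti_aligned, -anti_aligned))
```

Since `R` is unitary, its inverse is simply the hermitian conjugate, which is what the code uses.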


The Berry connection. Let us now calculate the Berry connection,
which involves applying the $\nabla_c$ operator. The $\mathbf{c}$ coordinates
are just the ordinary three-dimensional spherical coordinates,
where the (angular) components are given by $\nabla_\theta = \partial/\partial\theta$ and
$\nabla_\varphi = \sin^{-1}\theta\;\partial/\partial\varphi$. The Berry connection is then:

$$ \mathbf{C}(\theta, \varphi) = i\,\langle n;\mathbf{c}|\,\nabla_c\,|n;\mathbf{c}\rangle = \mp\,\frac{1 - \cos\theta}{2\sin\theta}\,\hat{\mathbf{e}}_\varphi \qquad (n = \pm 1) . $$

Up to the overall sign, which depends on the eigenstate that is
transported and on the orientation of the loop, this connection
is exactly the gauge potential written down
by Dirac in his famous 1931 monopole paper, and indeed
its curl gives the field of a magnetic monopole with magnetic
charge $eg/\hbar c = 2\pi$. The total magnetic flux through
the sphere is $2\pi$, which is half the solid angle of the total
sphere, being $4\pi$. And thus the resulting phase after
closing the loop is equal to the magnetic flux going through
the loop. It is nice to see how well this subject of the Berry
phase connects with matters that we discussed in early
chapters of Volume I.
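The relation between the Berry phase and the enclosed solid angle can be tested numerically without ever writing down the connection. The sketch below uses a discretized, gauge-invariant product of overlaps along a circle of constant polar angle; the aligned spinor $(\cos\frac{\theta}{2}, e^{i\varphi}\sin\frac{\theta}{2})$ and the sign convention $\gamma = i\oint\langle n|\nabla n\rangle$ are assumptions of this example, and with them the phase comes out as minus half the solid angle of the enclosed cap (modulo $2\pi$).

```python
import numpy as np


def aligned_spinor(theta, phi):
    """Spin state aligned with a field in the direction (theta, phi)."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])


def berry_phase_on_latitude(theta, steps=4000):
    """Berry phase for a loop at fixed polar angle theta, phi running from 0 to 2 pi.

    Uses the discrete formula gamma = -arg( prod_k <n_k | n_{k+1}> ),
    which approximates i * loop integral of <n| d/dphi |n>.
    """
    phis = np.linspace(0.0, 2.0 * np.pi, steps + 1)
    spinors = [aligned_spinor(theta, p) for p in phis]
    overlap_product = 1.0 + 0.0j
    for current, following in zip(spinors[:-1], spinors[1:]):
        overlap_product *= np.vdot(current, following)
    return -np.angle(overlap_product)


def solid_angle_of_cap(theta):
    """Solid angle of the polar cap bounded by the circle at polar angle theta."""
    return 2.0 * np.pi * (1.0 - np.cos(theta))


for theta in (np.pi / 3, 1.0, 2.0):
    gamma = berry_phase_on_latitude(theta)
    print(f"theta = {theta:.3f}: phase = {gamma:+.4f}, "
          f"half solid angle = {solid_angle_of_cap(theta) / 2:.4f}")
```

Reversing the orientation of the loop, or transporting the anti-aligned state instead, flips the sign of the phase; the magnitude is half the solid angle in all cases.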


Some explicit examples. Let us now consider some specific
paths and see how this works. In the first example
we start at the North Pole, meaning to say that $H = Z$ and
$|\psi(t = 0)\rangle = |1\rangle$. Then we parallel transport the vector along
a geodesic generated by rotating around the X-axis over an
angle $\theta = -\pi$ and bring it back along a geodesic generated
by rotating around the Y-axis by $\theta = \pi$. The path corresponds
to the red two-angle indicated in Figure II.3.44.
This means that the Hamiltonian between $t = 0$ and $t = \tfrac14$
smoothly rotates in the YZ-plane from $Z = H(0)$ to
$Y = H(\tfrac14)$. From formula (II.3.4) we see that:

$$ R_k(\pm\pi) = \pm i\,\sigma_k . $$

So the overall (unitary) transformation of the state vector
corresponds to:

$$ U = iY(-iX) = -iZ . $$

So the net effect on the state $|1\rangle$ after coming back home
is that it is rotated by an angle $\theta = -\pi/2$ around the z-axis.
So the loop integral would give a magnetic flux of
$\pi/2$, which is 1/4 of the total flux of $2\pi$, which in turn is
consistent with the fact that the loop covered 1/4 of the
total solid angle of the sphere.
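This first example is quickly verified with explicit matrices. The sketch below composes the two half-turns with the closed form of formula (II.3.4) and reads off the phase picked up by the state $|1\rangle$; the helper name `rotation` is illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rotation(sigma, theta):
    """R(theta) = cos(theta/2) + i sigma sin(theta/2), formula (II.3.4)."""
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * sigma


# First the rotation about X by -pi, then the rotation about Y by +pi.
U = rotation(Y, np.pi) @ rotation(X, -np.pi)

up = np.array([1, 0], dtype=complex)
phase = np.angle(np.vdot(up, U @ up))

print("U equals -iZ:", np.allclose(U, -1j * Z))
print("phase of |1> in units of pi:", phase / np.pi)
```

The magnitude of the phase, $\pi/2$, is half the solid angle $\pi$ of the two-angle enclosed by the two meridians, which is a quarter of the full sphere.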


So what is the interference effect on the probabilities measured,
if we start with $|\psi_1\rangle$ and end up at $|\psi_2\rangle = -iZ|\psi_1\rangle$?
The expression is given by the following equation for any
outcome of a measurement:

$$ I(a_i) = \tfrac12\big(\langle\psi_1|P^A_i|\psi_2\rangle + \text{c.c.}\big) . $$

The outcome is the expectation value of an expression involving
$U$ and $P^A_i$:

$$ I(a_i) = \tfrac12\,\langle\psi_1|\big(P^A_i\,U + U^\dagger P^A_i\big)|\psi_1\rangle . $$

We obtain that in the case at hand, by choosing $P = P^Z_\pm$, the
result is zero for all $|\psi_1\rangle$. However, for $P^Y_\pm$, which does not
commute with $Z$, we find a term proportional to $\langle\psi_1|X|\psi_1\rangle$,
which of course may or may not be observable depending
on the choice of the initial state.


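Another example would be as sketched in Figure I.2.32.
There we have three successive rotations by $\pi/2$:

$$ U = R_y(\tfrac{\pi}{2})\,R_z(\tfrac{\pi}{2})\,R_x(\tfrac{\pi}{2}) = \frac{1}{\sqrt2}\,(1 + iZ) \,, $$

where we used that

$$ R_k(\tfrac{\pi}{2}) = \frac{1}{\sqrt2}\,(1 + i\sigma_k) . $$

This generates interference in more situations than the
previous case, and applying it to $|\psi_1\rangle = |1\rangle$ we get:

$$ U|1\rangle = \frac{1}{\sqrt2}\,(1 + iZ)\,|1\rangle = \frac{1}{\sqrt2}\,(1 + i)\,|1\rangle = e^{i\pi/4}\,|1\rangle \,; $$

again this is consistent with 1/8 of the total flux.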
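The octant example can be checked in the same way. The assignment of axes below (first a quarter-turn about X, then about Z, then about Y) is an assumption of this sketch: it is the ordering of three quarter-turns about the three axes for which the product reduces to $(1 + iZ)/\sqrt{2}$, and it takes the Hamiltonian from the North Pole to the equator, a quarter of the way along the equator, and back up.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rotation(sigma, theta):
    """R(theta) = cos(theta/2) + i sigma sin(theta/2), formula (II.3.4)."""
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * sigma


quarter = np.pi / 2
# Operators act from right to left: X first, then Z, then Y (assumed ordering).
U = rotation(Y, quarter) @ rotation(Z, quarter) @ rotation(X, quarter)

up = np.array([1, 0], dtype=complex)
phase = np.angle(np.vdot(up, U @ up))

print("U equals (1 + iZ)/sqrt(2):", np.allclose(U, (I2 + 1j * Z) / np.sqrt(2)))
print("phase of |1> in units of pi:", phase / np.pi)
```

The phase $\pi/4$ is half the solid angle $\pi/2$ of one octant, that is 1/8 of the total flux $2\pi$.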


Quantum tunnelling: magic moves 


In this chapter we have considered the consequences of 
the quantessential particle-wave duality in typical wave type 


Figure II.3.45: Quantum tunnelling. The lowest energy state
of a particle in the presence of a potential wall shows that the
quantum particle is most probably found on the left-hand side,
but still has a small probability to be on the right-hand side. The
wavefunction decays exponentially in the wall but still has a
non-vanishing value when it arrives at the other side.


phenomena like reflection, refraction, diffraction and interference.
In this section we turn to the aspect of transmission,
notably the effect of quantum tunnelling, which is another
stunning instance where quantum theory overrides
a classical veto. In the tunnelling process we should think
of particles that can move through, or jump over, a potential
wall. This happens for example in the spontaneous
decay of bound systems, and has a great application in
scanning tunnelling microscopy (STM). Such processes
are strictly forbidden by classical physics but have finite,
although small (meaning exponentially small), probabilities
to occur in the quantum situation. It can be looked upon as
a consequence of the quantum fluctuations in the system
that 'follow' from the uncertainty relations.


Let us put a quantum particle in a bowl corresponding to 
a potential energy landscape as given in Figure II.3.45. 
Imagine the particle sitting in the origin at the bottom of 
Figure II.3.46: The scanning tunnelling microscope. Fixing the
tunnelling current fixes the distance between the tip and the
surface to be scanned. (Source: Flickr.)


the bowl. Then clearly, if we do not add enough energy 
to overcome the height of the bowl, it will sit there forever, 
since from a classical point of view it is a stable situation. 
However, in the quantum world the problem is different; the 
lowest energy solution for the wavefunction is sketched in 
red and the important point to observe is that it is non- 
zero outside the bowl. In other words, if we square the 
wavefunction we get a large probability to find the particle 
where we expect it, but there is a non-vanishing probability 
of finding the particle outside. There is a small probability 
for the particle to ‘jump’ the wall to the outside world, where 
we might observe it. It jumps a wall of any height as long 
as it is thin enough. 


In more physical terms you may think of a situation where
a particle is bound (and thus sitting in some potential well),
but if the well corresponds to a local minimum, then there
is a (low) probability that the particle will tunnel out of the
well, meaning that the system decays and emits the particle.
This is for example what happens with nuclear $\alpha$
decay: certain nuclei will spontaneously emit an $\alpha$ particle

Figure II.3.47: STM surface imaging. Image by a scanning
tunnelling microscope of a corral of atoms deposited on a surface.
(Source: IBM.)


which is in fact just a He nucleus consisting of two protons
and two neutrons. It is this tunnelling phenomenon that
explains the extremely (exponentially) small probabilities
that are reflected in the extremely long half-lives
of certain nuclei. Long means that the process takes place
with a much lower frequency than the natural frequency $f$
that corresponds to the energy $E$ of
the state, with $E = hf$.


A similar situation occurs if one sends a particle towards a
potential barrier (a wall). Classical physics may predict
some energy and momentum transfer during the impact
by which the particle is stopped or may be reflected, but
what we never have is that the particle would have a finite
chance of moving through the wall (without destroying
it). And this is exactly what happens in the quantum case,
where one finds a definite probability of ending up on the
other side as long as the wall has a finite size. The reflection
and transmission probabilities can be calculated and of
course add up to one.
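To make the exponential suppression concrete, here is a minimal sketch for the simplest case of a rectangular barrier of height $V_0$ and a given width, hit by an electron with energy $E < V_0$. The closed-form transmission probability used below is the standard textbook result for this idealized shape, not something derived in the text, and the numbers (a 5 eV barrier, a 1 eV electron, widths around a nanometre) are illustrative choices only.

```python
import math

# hbar^2 / (2 m_e) in eV nm^2, assuming the particle is an electron.
HBAR2_OVER_2M = 0.0380998


def decay_constant(energy, barrier_height):
    """Decay constant kappa (in 1/nm) of the wavefunction inside the barrier."""
    return math.sqrt((barrier_height - energy) / HBAR2_OVER_2M)


def transmission(energy, barrier_height, width):
    """Transmission probability through a rectangular barrier, for 0 < E < V0."""
    if not 0.0 < energy < barrier_height:
        raise ValueError("this sketch only covers 0 < E < V0")
    kappa = decay_constant(energy, barrier_height)
    factor = barrier_height**2 / (4.0 * energy * (barrier_height - energy))
    return 1.0 / (1.0 + factor * math.sinh(kappa * width) ** 2)


def reflection(energy, barrier_height, width):
    """Reflection probability; the two probabilities add up to one."""
    return 1.0 - transmission(energy, barrier_height, width)


E, V0 = 1.0, 5.0   # energies in eV
for width in (0.3, 0.5, 0.9, 1.0):   # widths in nm
    print(f"width {width:.1f} nm: T = {transmission(E, V0, width):.3e}")
```

For thick barriers the transmission falls off as $e^{-2\kappa\,\text{width}}$, so with these numbers every additional tenth of a nanometre costs roughly a factor of eight. This steep dependence on distance is what gives the scanning tunnelling microscope its resolution.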
We discussed a realization of tunnelling currents in Chapter II.1
for the Josephson junction. Another important application
is the scanning tunnelling microscope (STM). The
working is schematically depicted in Figure II.3.46. By driving
a constant small current through the tip to the surface
one wants to study, the tip will precisely scan the surface,
all the way down to atomic scales. The tip will never
touch the surface and the 'wall' is provided by the thin
insulating layer of air between the tip and the surface. The
images taken by the microscope of the surface localize
the presence of isolated atoms or molecules on the surface.
A nice example is given in Figure II.3.47. The STM
scans the contour of the charge density profile on
the surface. People can be stopped by virtual walls, but
walking through a real wall is quite something else, and
that is what quantum particles apparently can do.


Further reading on interference: 


— QED: 
The Strange Theory of Light and Matter 
Richard P Feynman Antony Zee 
(revised version) 
Princeton University Press (2014) 


— Quantum Interference and Coherence: Theory 
and Experiments 
Zbigniew Ficek and Stuart Swain 
(Springer) (2005) 


Chapter II.4 


Teleportation and computation 


Entanglement and teleportation 


The Einstein—Podolsky—Rosen paradox 


In 1935 Albert Einstein, Boris Podolsky and Nathan Rosen
confronted quantum physics with a profound objection
concerning the quantessential property of entanglement. This
led to a fierce debate between Bohr and Einstein, closely
followed by Schrödinger. In those days the problem was
presented as a gedanken experiment involving a pair of
spins or qubits which are entangled but widely separated
in space. One may think of a spin-less particle at rest (a $\pi^0$
particle for example) decaying into two photons. Because
of momentum conservation both particles will fly off back
to back, and because of spin conservation the polarizations
of the two photons have to be opposite. This means that
without interactions the particles could separate and travel
a long way, and we could imagine that one might arrive in
New York and the other in Tokyo, where Alice and Bob will
make polarization measurements. The polarization state
of the entangled pair is given by:

$$ |\psi_{NT}\rangle = \frac{1}{\sqrt2}\big(|1,-1\rangle - |-1,1\rangle\big) \,, \tag{II.4.1} $$

where the first entry refers to the NY particle and the second
to its Tokyo counterpart, and we have for convenience
assumed the particles to be polarized in the z-direction. Now

Figure II.4.1: The Myth of Depth, a 1984 painting by Mark
Tansey. It makes you think of unusual, if not magical, ways
information may propagate. It is the 'spooky action at a distance'
Einstein was so worried about. (Source: ANP / Mark Garlick /
Science Photo Library)


Alice in New York decides to make a polarization measurement.
Let us suppose that she chooses to do this along
the x-axis, and let us also suppose that she finds a value
$+1$. Then we know that the first spin is projected on the
$|+\rangle$ state. But as the spins are opposite it follows that
instantaneously the spin of the particle in Tokyo must have
changed to the $|-\rangle$ state. That this indeed has to be the
case follows from the fact that we could have written the
initial state also in the form

$$ |\psi_{NT}\rangle = \frac{1}{\sqrt2}\big(|-,+\rangle - |+,-\rangle\big) \,, $$

and Alice's measurement projects on the $|+,-\rangle$ term, as we
discussed in the previous section, so after Alice's measurement
we have $|\psi_{NT}\rangle \to |+,-\rangle$. If Bob also decides to measure along the
x-axis, then he will obtain the value $-1$. It is clear that the
probabilities for measurement outcomes can be precisely
calculated for all possible independent choices that Alice
and Bob could make.
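The statement that all outcome probabilities can be calculated is easy to illustrate. The sketch below treats the two polarizations as qubits with $|1\rangle = (1, 0)$ and $|-1\rangle = (0, 1)$, and lets Alice and Bob each measure along an axis in the xz-plane, tilted by an angle from the z-axis (so the x-frame corresponds to an angle $\pi/2$). The function names and the restriction to real measurement frames are choices made for this example.

```python
import numpy as np

up = np.array([1.0, 0.0])      # |1>
down = np.array([0.0, 1.0])    # |-1>
plus = (up + down) / np.sqrt(2)
minus = (up - down) / np.sqrt(2)

# The entangled state of equation (II.4.1).
pair = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)


def eigenstate(angle, outcome):
    """Real eigenstate with eigenvalue `outcome` for an axis tilted by `angle` from z to x."""
    if outcome == +1:
        return np.array([np.cos(angle / 2), np.sin(angle / 2)])
    return np.array([-np.sin(angle / 2), np.cos(angle / 2)])


def joint_probability(alice_angle, alice_outcome, bob_angle, bob_outcome):
    """Probability that Alice and Bob find the given pair of outcomes."""
    bra = np.kron(eigenstate(alice_angle, alice_outcome),
                  eigenstate(bob_angle, bob_outcome))
    return float(np.dot(bra, pair) ** 2)


x_frame = np.pi / 2
print("same state in the x-frame:",
      np.allclose(pair, (np.kron(minus, plus) - np.kron(plus, minus)) / np.sqrt(2)))
for a in (+1, -1):
    for b in (+1, -1):
        print(f"p(Alice {a:+d}, Bob {b:+d}) =",
              round(joint_probability(x_frame, a, x_frame, b), 6))
```

With both measuring in the x-frame the only outcomes that occur are opposite ones, each with probability 1/2, so once Alice has found $+1$ Bob's result $-1$ is certain. For different frames the probability of opposite outcomes becomes $\cos^2$ of half the angle between the two axes.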


There is a lot at stake in this proposed experiment, and in
the early days it was too hard to perform. If the calculated
and observed distributions did not match, then
quantum theory would be in deep trouble, not to say falsified!
As we will discuss later, starting in the 1980s such
experiments became feasible, and in fact unambiguously
confirmed the quantum predictions.


Quantum key distribution. The above observations allow
for a quite simple protocol to securely share a digital
key, called the BB84 protocol. It was invented by Gilles
Brassard and Charles Bennett in 1984 and opened a research
field in quantum informatics called quantum cryptography.
Their protocol benefits from the fundamental principles of
quantum mechanics and enables secure communication
between parties. Nowadays, their protocol is commercially
available and forms the core of many other protocols in
quantum cryptography and quantum information in general.
Brassard and Bennett shared the prestigious Breakthrough
Prize in Fundamental Physics 2023 with David
Deutsch and Peter Shor.¹ The Shor quantum algorithm
for prime factorization will be discussed in the next section
on quantum computation. The protocol is illustrated in
Figure II.4.4. Alice and Bob take a large sequence of measurements
on (in this case) parallel polarized entangled


¹ The Breakthrough Prize in Fundamental Physics is one of the
largest prizes in science, both in money and in prestige, and was
founded in 2012 by Yuri Milner.




Figure II.4.2: The Einstein-Podolsky-Rosen (EPR) paradox.
Two particles are created in a polarization entangled state, and
a measurement outcome on the left particle completely determines
the probabilities for the measurement outcomes on the
right in any frame. The coincidence detector is there to make
sure that measurements on members of the same pair are compared.


pairs and make a list of their sequence of polarizer settings
and their outcomes. Afterward they may exchange the
sequences of their polarization settings. If they now select
the outcomes for the pairs where the setting was identical,
then the outcomes must be the same. Therefore this
restricted sequence represents a shared digital code,
as may be verified from the figure.


If one imagines an eavesdropper Eve somewhere measuring
one of the photons, she cannot copy it and resend it.
This means that the codes that Alice and Bob observe
will no longer coincide. So they can check whether
their communication channel is secure. Clearly Eve cannot
extract any key from her observations.
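The sifting step and the effect of an eavesdropper can be mimicked with a small classical simulation. This is only a toy model under stated assumptions: there are two polarizer settings, labelled "Z" and "X"; a photon measured in the frame in which its partner was measured reproduces the partner's outcome, and in the other frame it gives a random outcome; and Eve is assumed to perform a simple intercept-and-resend attack with a randomly chosen frame. The function names are illustrative.

```python
import random


def measure(photon, setting, rng):
    """Measure a photon described by (frame, bit).

    The same frame reproduces the bit; the other frame gives a random bit.
    """
    frame, bit = photon
    return bit if setting == frame else rng.randint(0, 1)


def share_key(n_pairs, eavesdrop=False, seed=1):
    """Return the sifted bit sequences of Alice and Bob for n_pairs entangled pairs."""
    rng = random.Random(seed)
    alice_key, bob_key = [], []
    for _ in range(n_pairs):
        alice_setting = rng.choice("ZX")
        bob_setting = rng.choice("ZX")
        alice_bit = rng.randint(0, 1)          # Alice's outcome is random
        photon = (alice_setting, alice_bit)    # Bob's photon is correlated with it
        if eavesdrop:
            # Hypothetical attack: Eve measures in a random frame and resends her result.
            eve_setting = rng.choice("ZX")
            photon = (eve_setting, measure(photon, eve_setting, rng))
        bob_bit = measure(photon, bob_setting, rng)
        if alice_setting == bob_setting:       # sifting: keep equal settings only
            alice_key.append(alice_bit)
            bob_key.append(bob_bit)
    return alice_key, bob_key


def error_rate(key_a, key_b):
    """Fraction of positions at which the two sifted sequences differ."""
    return sum(a != b for a, b in zip(key_a, key_b)) / len(key_a)


secure_a, secure_b = share_key(4000)
tapped_a, tapped_b = share_key(4000, eavesdrop=True)
print("sifted key length:", len(secure_a))
print("error rate without Eve:", error_rate(secure_a, secure_b))
print("error rate with Eve:", round(error_rate(tapped_a, tapped_b), 3))
```

In this model about half of the pairs survive the sifting. Without Eve the two sequences agree exactly, while her interference shows up as disagreements in roughly a quarter of the sifted positions, which Alice and Bob can detect by publicly comparing a sample of their bits.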


Is causality violated? Einstein's first worry was that this
instantaneous non-local consequence of the act of measurement
of one of the particles of the entangled pair
would violate causality. Some information about Alice's
measurement outcome appears to have been transmitted
instantaneously to Tokyo, which means that it had to travel
with a velocity exceeding the speed of light. And that is a
no-go in Einstein's relativity!


Figure II.4.3: EPR schematic. The measurement scheme for a
particularly simple choice of measurements in the EPR experiment.
The pair is created in the state $|\psi_{NT}\rangle = (|1,-1\rangle - |-1,1\rangle)/\sqrt2$
with opposite polarization in the z-frame. Alice in New York
measures in the x-frame, so she finds outcome $\pm 1$ with equal
probability $p_x(\pm 1) = 1/2$. If Bob in Tokyo subsequently also
measures in the x-frame, his outcome, according to quantum
theory, is completely fixed.


So, the first task is to establish whether the correlations
between the measurement outcomes would necessarily
require the transfer of information faster than light. If so,
this would mean that such pairs could be used to trans- 
mit information faster than light, which in turn would imply 
the breakdown of special relativity in particular and of our 
cherished notion of causality in general. 


The question should be: what can Bob learn from Alice 
making a measurement? As a matter of fact, the answer is: 


[Figure II.4.4 diagram: two rows of polarizer settings and
outcomes, one for Alice in New York and one for Bob in Tokyo.]


Figure II.4.4: Quantum key sharing. Using a sequence of parallel
entangled photons for key distribution through the BB84
protocol. On top in green is the sequence of polarizer choices 
that Alice made and in the second line her measurement out- 
comes. In the red box we give the sequence of Bob and his 
outcomes. What we know for sure is that when the members 
of a pair are measured in the same polarization frame, the out- 
comes should be identical. And indeed, if we cross out the mea- 
surements where the frames are different, we are left with two 
identical sequences. If this happens not to be the case, Alice 
and Bob know that an eavesdropper is active somewhere. 


nothing at all, at least as long as he doesn’t know what the 
polarization axis is that she has chosen for her measure- 
ment, and what the outcome of her measurement was. But 
she can only inform him about that by conventional means 
using subluminal velocity media like email or Facebook. 
So this form of information sharing does not violate causality.
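

A small numerical check makes the point concrete. The sketch below
assumes the antisymmetric pair state of Figure II.4.3 and lets Alice
measure along an arbitrary axis in the x-z plane; the function name
`bob_density_matrix` and the parametrization by an angle `theta` are
choices made here for illustration. It computes the statistical state
of Bob's particle when he does not know Alice's outcome.

```python
import math

R = 1 / math.sqrt(2)
# PSI[a][b]: amplitude for Alice's z-index a and Bob's z-index b
# (index 0 means +1, index 1 means -1); this is (|1,-1> - |-1,1>)/sqrt(2).
PSI = [[0.0, R], [-R, 0.0]]

def bob_density_matrix(theta):
    """Bob's density matrix after Alice measures along an axis at angle
    theta from z, summed over her two possible outcomes."""
    up = (math.cos(theta / 2), math.sin(theta / 2))      # Alice finds +1
    down = (-math.sin(theta / 2), math.cos(theta / 2))   # Alice finds -1
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for vec in (up, down):
        # Unnormalized state of Bob's particle for this outcome of Alice;
        # its squared norm is the probability of that outcome.
        phi = [sum(vec[a] * PSI[a][b] for a in range(2)) for b in range(2)]
        for i in range(2):
            for j in range(2):
                rho[i][j] += phi[i] * phi[j]
    return rho

for theta in (0.0, 0.7, math.pi / 2, 2.5):
    print(bob_density_matrix(theta))
```

For every angle the result is half the identity matrix, so nothing
Bob can measure locally depends on which frame Alice chose.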


Hidden variables and local realism. The proposition of 
the EPR trio was that quantum theory, which clearly was 
in accordance with all available observations, was maybe 
not really wrong but at least incomplete. The paradox furthermore
implied that once completed the theory would not
need these ‘spooky’ instantaneous non-local kind of
interactions. Any physically sound theory should obey the
principle of local realism. Local realism maintains that each
of the particles is always in a definite polarization state all 
along, but it just happens to be so that we don’t know which 
state that is. The state is always completely determined 
but we don’t know how it is fixed. Maintaining local realism
would be possible if the highly correlated
nature of the outcomes were a manifestation of ordinary
statistics caused by the existence of certain hidden
variables. The need for that strange, non-local,
instantaneous ‘action at a distance’ could be avoided
if one knew these hidden variables
and could measure them. In other words, Einstein was
not arguing about the predictions of quantum theory per
se; rather, the proposed probabilistic formalism would only be
part of the story, a kind of effective description of nature,
and not a fundamental ingredient of the resulting complete
theory. His proposal would turn the fundamental
indeterminacy of quantum theory merely into a lack of knowledge
about the set of state variables. A fundamentally unde- 
termined state would just become a fully determined, but 
unknown state. 


This line of reasoning caused a rather deep controversy 
about the measurement problem and the interpretation of 
quantum theory. Because the tremendous successes of 
quantum theory continued to unfold, this Einstein-Bohr debate
lingered on somewhat in the margins as a kind of pastime
for philosophers of science, until in 1964 John Stewart
Bell, a British physicist working at the CERN accelerator
center in Geneva, made the groundbreaking discovery
that there are situations where quantum theory would
directly contradict the local realist predictions. Bell turned
local realism into a falsifiable hypothesis! The question
was to set up a true EPR experiment and precisely mea- 
sure the correlations between the measurement outcomes 
for the entangled pairs. Bell’s proposal moved the question 
out of the realm of abstract epistemology into that of exper- 
imental physics. This deep question allowed for a definite 
answer. This is the subject of the next section. 


The Bell inequalities 


The discussion of Bell is about the EPR pairs and the
measurements illustrated in Figure II.4.5. The question is
indeed whether a hidden variable theory could ever account
for the data as predicted by quantum theory. Is there a de- 
terministic scheme which respects local realism that per- 
fectly mimics the quantum theory and the measurements 
on entangled states? The difficulty is in some sense to 
produce the extremely strong instant correlations between 
measurement outcomes that quantum mechanics allows, 
even if the particles are far apart. 


The correlator. John Bell devised an experimental test ex- 
actly based on these correlations. To stay in the language 
of the previous section, Bell proposed to consider the av- 
erage of the product of measurement outcomes of Alice 
and Bob P(a, b) where a and b are the (real) frames of 
Alice and Bob as depicted in Figure II.4.5(d). If we imagine
that they both choose the same polarization, one finds for 
example that: 


$P(a, a) = -1$ and $P(a, -a) = 1$, (II.4.2)


because if they have the same frame the measurement
outcomes will be opposite and the product becomes minus
one. If the polarizers a and b are along the same
direction but oriented oppositely, they both measure the
same value and thus the correlator is plus one. To calculate
the correlator $P(a, b)$ in general for the quantum case,
we just have to look at the figure, where we learn that if
the angle between the frames of Alice and Bob is $\beta$ and
Alice measures $+1$, then Bob measures $+1$ with probability
$p_b(+1) = \sin^2\beta$ and $-1$ with probability
$p_b(-1) = \cos^2\beta$. If Alice measures $-1$, the
probabilities $p_b(\pm 1)$ get interchanged. This is consistent
with equation (II.4.2): the formula (II.4.3) below gives
$P(a, a) = -\cos 0 = -1$ and $P(a, -a) = -\cos\pi = 1$.
From these considerations one obtains the
[Figure II.4.5 panels:
(a) The electron-positron pair is produced in some frame $\sigma_s$ in
the antisymmetric entangled state
$|\psi_{NT}\rangle = \frac{1}{\sqrt{2}}(|1,-1\rangle_s - |-1,1\rangle_s)$,
which is represented by the double arrow.
(b) The electron-positron pair state in the frame $\sigma_a = \sigma_z$
of Alice in New York. The antisymmetric form is preserved under
rotations, and we just replace the subscript s with a.
(c) The spin measurement in the $\sigma_a$ frame of Alice in New York.
She measures the eigenvalue $\lambda_a = +1$, and projects the New York
component on the $|+1\rangle$ state.
(d) After Alice’s measurement the Tokyo component collapses to
$|\psi_T\rangle = |-1\rangle_a$, from which the probabilities for the
measurement outcomes in the $\sigma_b$ frame of Bob follow.]


Figure II.4.5: The Einstein-Podolsky-Rosen paradox. (a) A neutral particle decays into an entangled electron-positron pair; these
travel in opposite directions to New York and Tokyo and have oppositely polarized spins in some frame. Alice and Bob subsequently
make measurements in frames they may choose independently and each will measure an outcome $\pm 1$. The sequence of subfigures
explains that the final probability for Bob is $\sin^2\beta$ to find $+1$, and $\cos^2\beta$ to find $-1$. These probabilities depend on Alice’s choice
and are instantly fixed after Alice has made her measurement.

following formula for $P(a, b)$:


$P(a,b) = \tfrac{1}{2}\bigl[\,1\cdot(\sin^2\beta - \cos^2\beta) - 1\cdot(\cos^2\beta - \sin^2\beta)\bigr]$

$\qquad\;\; = \sin^2\beta - \cos^2\beta = -\cos 2\beta.$ (II.4.3)


Introducing hidden variables. To describe the measurements
in hidden variable theory we can introduce two functions
$A(a, \lambda)$ and $B(b, \lambda)$ representing the measurement
outcomes of Alice and Bob respectively, which are strictly
local in the sense that they only depend on their own
measurement frame, and now also on a hidden variable $\lambda$. A
value for this variable is typically set at the moment when
the particles are produced and that value stays fixed for
both in the absence of interactions. Given $\lambda$ and a choice
of frame a, the outcome $A(a, \lambda)$ is fixed. The question
is whether there exist such functions that reproduce the
quantum results of equation (II.4.3). Here Bell brilliantly
rephrased the question. Instead of just trying to directly 
prove or disprove the existence, he derived a condition (in 
fact a bound or inequality), which any hidden variable the- 
ory would have to satisfy under quite general assumptions, 
and subsequently showed that quantum theory allows for 
ample situations where this condition would be violated. 
Answering the question was now reduced to performing 
certain experiments and seeing whether the results would 
violate the inequality or not. If they do not, hidden vari- 
ables would be a viable option, but if they do, that would 
be the demise of the theory of hidden variables and local 
realism! 


The Bell inequality. Let us first agree that A and B can
only equal $\pm 1$, because they are measurement outcomes.
The only thing we assume about $\lambda$ is that it can take certain
values with a probability $w(\lambda)$, where we have to require
that $w(\lambda) \geq 0$ and $\sum_\lambda w(\lambda) = 1$. The classical
‘local realist’ correlator $P_{lr}(a,b)$ is then defined as the
weighted sum:


$P_{lr}(a, b) = \sum_\lambda w(\lambda)\, A(a, \lambda)\, B(b, \lambda).$ (II.4.4)


For the case where the frames are equal we obtain the
equality $A(a, \lambda) = -B(a, \lambda)$. To get the required
inequality Bell introduced an arbitrary third frame c and
considered the expression:


$P_{lr}(a, b) - P_{lr}(a, c)$
$\quad = -\sum_\lambda w(\lambda)\,[A(a,\lambda)A(b,\lambda) - A(a,\lambda)A(c,\lambda)]$
$\quad = -\sum_\lambda w(\lambda)\,[1 - A(b,\lambda)A(c,\lambda)]\,(A(a,\lambda)A(b,\lambda)),$


where we have multiplied the second term in the first line
with $A(b, \lambda)^2 = 1$ and taken out an overall factor equal to
the first term. This yields the second line, where we have
a factor in square brackets and one in parentheses. The
factor in square brackets is always larger than or equal to zero,
whereas the factor in parentheses is either plus or minus
one, and may depend on $\lambda$. The sum over $\lambda$ may be over
terms with alternating signs. Therefore, if we replace the
factor in parentheses by minus one in every term, the
right-hand side becomes a sum of non-negative terms, which is
larger than or equal to the absolute value of the expression
as it stands. And that is where the inequality comes in: we
obtain a bound for the absolute value of the left-hand side:


$|P_{lr}(a, b) - P_{lr}(a, c)| \leq \sum_\lambda w(\lambda)\,[1 - A(b,\lambda)A(c,\lambda)],$ (II.4.5)

which yields the Bell inequality:


$|P_{lr}(a, b) - P_{lr}(a, c)| \leq 1 + P_{lr}(b, c).$ (II.4.6)


We see that the inequality involves three classical cor- 
relators and three frames that can be chosen indepen- 
dently. 
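

As a sanity check on the derivation, one can test the inequality
numerically. The sketch below is an illustration under stated
assumptions rather than a proof: it takes the hidden variable to be
nothing more than a list of predetermined outcomes for Alice in the
three frames (with Bob's outcome always opposite in the same frame),
and draws random probability weights over the eight possible lists.
The names `correlators` and `satisfies_bell` are hypothetical.

```python
import itertools
import random

# Each hidden-variable value fixes Alice's outcomes (A_a, A_b, A_c);
# Bob's outcome in the same frame is the opposite, B = -A.
ASSIGNMENTS = list(itertools.product((+1, -1), repeat=3))

def correlators(weights):
    """Local realist correlators P(a,b), P(a,c), P(b,c) for given weights."""
    p_ab = -sum(w * s[0] * s[1] for s, w in weights.items())
    p_ac = -sum(w * s[0] * s[2] for s, w in weights.items())
    p_bc = -sum(w * s[1] * s[2] for s, w in weights.items())
    return p_ab, p_ac, p_bc

def satisfies_bell(weights, tol=1e-12):
    """Check |P(a,b) - P(a,c)| <= 1 + P(b,c), up to rounding error."""
    p_ab, p_ac, p_bc = correlators(weights)
    return abs(p_ab - p_ac) <= 1 + p_bc + tol

rng = random.Random(1)
all_ok = True
for _ in range(10000):
    raw = [rng.random() for _ in ASSIGNMENTS]
    total = sum(raw)
    weights = {s: r / total for s, r in zip(ASSIGNMENTS, raw)}
    all_ok = all_ok and satisfies_bell(weights)
print(all_ok)
```

Every randomly chosen weighting respects the bound, as the derivation
says it must.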


Quantum violates the bound. The fundamental issue 
is now whether we can arrange a set of quantum mea- 
surements that yield correlators that may violate this in- 
equality. If we succeed, those measurement outcomes 
could not have been obtained from a theory with hidden 
variables. It is not hard to find a simple example: let us
return to Figure II.4.5(d), for which we already calculated
that $P(a,b) = -\cos 2\beta$. Let us choose $a = Z$, $b = X$,
and $c = (X + Z)/\sqrt{2}$ right in between a and b; then we
obtain $P(a,b) = -\cos\frac{\pi}{2} = 0$ and $P(a,c) = P(b,c) =
-\cos\frac{\pi}{4} \approx -0.71$. Substitution of these numerical values
in equation (II.4.6) shows that the inequality is indeed
violated: $0.71 \not\leq 0.29$!
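

The numbers in this example can be reproduced directly from the
entangled state. The following sketch assumes the antisymmetric pair
state in the z-basis and represents the frames Z, X and
$(X+Z)/\sqrt{2}$ by 2x2 matrices; the helper name `correlator` is a
choice made here for illustration.

```python
import math

R = 1 / math.sqrt(2)
# PSI[a][b]: amplitudes of (|1,-1> - |-1,1>)/sqrt(2); index 0 is +1, 1 is -1.
PSI = [[0.0, R], [-R, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
C = [[(X[i][j] + Z[i][j]) * R for j in range(2)] for i in range(2)]

def correlator(A, B):
    """Quantum average <psi| A (x) B |psi> of the product of outcomes."""
    return sum(PSI[a][b] * A[a][c] * B[b][d] * PSI[c][d]
               for a in range(2) for b in range(2)
               for c in range(2) for d in range(2))

p_ab = correlator(Z, X)
p_ac = correlator(Z, C)
p_bc = correlator(X, C)
lhs = abs(p_ab - p_ac)      # left-hand side of the Bell inequality
rhs = 1 + p_bc              # right-hand side of the Bell inequality
print(round(lhs, 2), round(rhs, 2))   # prints 0.71 0.29
```

The left-hand side exceeds the right-hand side, so no assignment of
hidden variables of the kind considered above can reproduce these
three correlators.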


In conclusion we may say that quantum theory is clear 
about what to expect, and the really big question was to 
‘just perform’ such experiments. And that is what we turn 
to next. 


Hidden no more 


The history of EPR experiments performed since Bell pub- 
lished his inequalities is interesting on its own, because it 
was immensely hard to actually do the experiment in a way 
that would satisfy all critics. Indeed, as the stakes were so 
high all experiments were analysed with the highest con- 
ceivable level of scientific scrutiny. 


There were always new loopholes that the experimenters 
had to try and eliminate, and probably there always will 
remain some far-fetched loopholes for example question- 
ing whether the experimenters have a free will to really 
choose the settings randomly etc. Fortunately, over the 
last few decades impressive progress has been made, and 
experiments have improved so much that it appears that
the Einstein-Bohr debate is finally settled and that local
realism seems no longer a tenable alternative for quantum
theory. 


And it is for that reason that only in 2022 the achievements
were given the highest degree of recognition, as the Nobel
prize was awarded to three pioneers who successively
developed the experimental set-ups that provided foolproof
evidence that the hidden variable theories implementing
local realism were no longer feasible. The award went to
Frenchman Alain Aspect, the American John F. Clauser 
Figure II.4.6: The Delft experiment. The setup of the 2015
loophole-free Bell inequality violation experiment at Delft
University of Technology. The measurement stations A and B are 1.3
km apart, ensuring that the measurements are indeed spacelike
separated and causally disconnected. (R. Hanson et al. Nature,
Vol 526, 2015)


and the Austrian Anton Zeilinger, ‘for experiments with en- 
tangled photons, establishing the violation of Bell inequali- 
ties and pioneering quantum information science.’ 


The Delft experiment. One of the more recent experiments
is the ‘loophole-free Bell inequality violation
experiment’ performed in 2015 by Ronald Hanson’s group at
Delft University of Technology in the Netherlands. It
uses two electron spins in the maximally entangled
antisymmetric state

$|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|1,-1\rangle - |-1,1\rangle\bigr).$


We sketched the setup of the experiment in Figure II.4.6. 
It involves three stations A, B, and C. In A and B two elec- 
trons are prepared in the entangled state. First each of 
them emits a photon so that the photon and electron are 
entangled. The photons are then sent through an optical 
fiber to station C, where they are measured in a clever way 
so that the measurement can be used to verify that the
electrons are indeed in the desired entangled state given 
above. This verification of the state to be measured is one 
of the loopholes that has weakened earlier attempts to cor- 
ner the hidden variables option. The entangled electrons 
enter measurement devices in A and B, where indepen- 
dently a random choice between two distinct polarization 
directions is made for each of them. In A one chooses the
observable a equal to either $Z$ or $X$, and in B the observable
b equal to either $(-Z + X)/\sqrt{2}$ or $(-Z - X)/\sqrt{2}$. The
stations A and B are 1.3 km apart, and therefore the choices
and measurements are space-like separated, implying that 
there can’t be any causal relation between them. This is 
indicated in the figure where the future light cones of the 
random choice events and measurement events at A and 
B are drawn, and one sees that they are outside each 
other’s future light cones indeed. And this was another 
loophole that hampered earlier experiments. So this ex- 
periment really closes both the preparation and locality 
loopholes simultaneously and that leaves little room for the 
hidden variables scenario to survive. Again, one can
calculate a bound on a weighted average S of the product of
measurement outcomes x in A and y in B where,


$S = \Bigl|\sum_{a,b} \langle\psi|\, a \otimes b \,|\psi\rangle\Bigr|.$


The classical bound respecting local realism can be shown 
to yield $S \leq 2$, whereas the quantum value can be calculated
giving $S = 2\sqrt{2} \approx 2.83$. The highly sophisticated
2015 Delft experiment of Ronald Hanson et al. measured 
a total of 245 trials over a period of 18 days; this yielded an 
average value 2.42 with a standard deviation of 0.2. 
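

Both bounds can be checked with a short calculation. The sketch below
assumes the standard CHSH form of the combination, in which one of the
four terms enters with a relative minus sign; which term carries that
sign for the settings quoted in the text is a choice made here so that
the quantum value comes out maximal, and the names `correlator`,
`S_quantum` and `S_local_max` are hypothetical.

```python
import itertools
import math

R = 1 / math.sqrt(2)
PSI = [[0.0, R], [-R, 0.0]]          # antisymmetric pair state, z-basis
Z = [[1.0, 0.0], [0.0, -1.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
B1 = [[(-Z[i][j] + X[i][j]) * R for j in range(2)] for i in range(2)]
B2 = [[(-Z[i][j] - X[i][j]) * R for j in range(2)] for i in range(2)]

def correlator(A, B):
    """Quantum average <psi| A (x) B |psi> of the product of outcomes."""
    return sum(PSI[a][b] * A[a][c] * B[b][d] * PSI[c][d]
               for a in range(2) for b in range(2)
               for c in range(2) for d in range(2))

# Assumed sign pattern: the (X, B1) term enters with a minus sign.
S_quantum = abs(correlator(Z, B1) + correlator(Z, B2)
                - correlator(X, B1) + correlator(X, B2))

# Local realism: every setting has a predetermined outcome +1 or -1.
S_local_max = max(abs(a0 * y0 + a0 * y1 - a1 * y0 + a1 * y1)
                  for a0, a1, y0, y1 in itertools.product((1, -1), repeat=4))

print(round(S_quantum, 2), S_local_max)   # prints 2.83 2
```

Since any local realist model is a probabilistic mixture of such
predetermined assignments, its value of S cannot exceed the maximum
found by the enumeration.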


Conclusion. We conclude that spooky action at a dis- 
tance is just there and we have to live with it. Quantum 
weirdness is not fake; it is rock solid! It turns out to be 
a blessing in disguise, because it implies the spectacular 
possibility of quantum teleportation, to which we will turn 
after we have described a second experiment that also re- 
futes the idea of local realism. 


A decisive three photon experiment 


There is one more experiment on entangled states that I
would like to describe in some more detail. It is a wonderfully
conceived and designed experiment, which in a sense is
so clean and therefore easy to understand, that I think it
really gave a final blow to the idea of local realism and hidden
variables. It is called the Greenberger-Horne-Zeilinger
or GHZ experiment² and involves three (in fact even four)
photons in a maximally entangled state. At first it
may look dauntingly complicated, but the prediction is so
radically unambiguous, and the reasoning so straightfor- 
ward that it really is a litmus test on the matter of local 
realism. The answer is a clean yes or no, and does not 
involve a bound that has to be violated. In this experi- 
ment the outcomes predicted by the quantum hypothesis 
on the one hand and local realism on the other are mutu- 
ally exclusive and that makes this experiment so powerful 
and attractive. It brings the inner workings of quantum the- 
ory to the surface. The results unambiguously prove the 
existence of entanglement and therefore of quantum non- 
locality. 


To give you an idea of the experimental setup, we have 
reproduced the schematic in Figure II.4.7. From a photon
source maximally entangled pairs are generated, each
member goes through a beamsplitter and we end up with 
basically four entangled photons. One of the photons is 
used as a trigger, and if the four detectors fire simultane- 
ously, one knows that the three entering in the three main 
detectors are in a maximally entangled GHZ-state. These 
three photons can be analyzed in detectors det 1, det 2 
and det 3. The detectors are space-like separated, mean- 
ing that the measurements cannot influence each other in 
a causal way, and they are designed such that you can 


²The setup of the experiment was introduced in a paper in 1989 by
Greenberger, Horne, and Zeilinger and the experiments were carried
out by a European collaboration of Pan, Bouwmeester, Daniell,
Weinfurter and Zeilinger in 2000 (Nature, Vol 403, 2000).
Figure II.4.7: The GHZ experiment. This exploits three entangled
particles to unambiguously demonstrate that quantum
theory violates local realism, thereby closing the door on the fa- 
mous Bohr-Einstein debate. (Source: Nature, Vol 403, 2000) 


switch between three different polarization bases, in
particular the X-basis with eigenstates $|\pm\rangle$, the Y-basis
with eigenstates $|L/R\rangle$, and the Z-basis with eigenstates
$|\pm 1\rangle$; the measurement outcomes can be either $+1$ or
$-1$. If the detectors are all in the Z-basis, you can see
how the entangled state is actually prepared by the array
of beam splitters and the $\lambda/2$ wave plate. The criteria for
data selection are (i) that the trigger (detector) selects the
events with outcome $-1$ and (ii) that indeed all four detectors
pick up a simultaneous signal. These criteria can only be
met in two distinct cases, which we have depicted in the
two diagrams of Figure II.4.8.³


Let us now analyze the quantum predictions for the experiment,
which starts with the three-photon GHZ state:


$|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|{+1},{+1},{+1}\rangle + |{-1},{-1},{-1}\rangle\bigr).$ (II.4.7)


³To be precise, detector det 3 is turned 60 degrees to invert the
readout ($+1 \leftrightarrow -1$).


Figure II.4.8: Contributions to GHZ. The diagrams show the
only two possible contributions to the three (or four) photon
entangled state with trigger on -1, to a ZZZ measurement. (Source:
Nature, Vol 403, 2000)


We can now express this state in various different bases,
and GHZ proposed to study a sequence of four measurements
with the detectors det 1, det 2 and det 3 in the following
order: first the cyclic variations YYX, YXY, XYY and
finally an XXX measurement. Knowing the result of the
first three measurements, both the quantum adepts and
the local realist followers can take that data, turn their re- 
spective cranks and come out with a unique prediction for 
the possible outcomes of the fourth experiment and their 
probabilities. The beauty of this experiment is that oppos- 
ing schools of thought come out with mutually exclusive 
predictions! So it is a real ‘yes or no’ for quantum versus 
local realism. 


So let us see how the quantum analysis goes; it is basically
what we have been doing before, only a little more
of it. To determine the various possible measurement outcomes
and their probabilities we have to rewrite the GHZ
state in the other bases, and because we know the linear
combinations this is a matter of making the appropriate
substitutions in the expression (II.4.7):


$|{+1}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|+\rangle + |-\rangle\bigr), \qquad |{-1}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|+\rangle - |-\rangle\bigr),$

$|{+1}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|L\rangle + |R\rangle\bigr), \qquad |{-1}\rangle = \tfrac{i}{\sqrt{2}}\bigl(|L\rangle - |R\rangle\bigr).$


So for example in the YYX experiment we would encounter
the state:


$|\psi\rangle = \tfrac{1}{2}\bigl(|L,R,+\rangle + |R,L,+\rangle + |L,L,-\rangle + |R,R,-\rangle\bigr).$ (II.4.8)


Let us make some observations on this state. The probability
of finding a $+1$ or $-1$ result for any of the three photons
is 50%, meaning that it is maximally random: it is like
throwing a fair coin. Next note that the outcomes of
each possible pair out of the three photons also have equal
probabilities: say for the first two detectors, the
possible outcomes $(+1,+1)$, $(+1,-1)$, $(-1,+1)$,
and $(-1,-1)$ each occur with 25% probability. Finally it is
also clear that given the outcome of two of the measurements
the third is completely fixed. If the first two give LR


[Figure II.4.9 panels: Quantum Theory, Local realism, Experiment.]


Figure II.4.9: The decisive result. The predictions of quantum
theory (top) and local realism (middle) for the outcome of the
XXX experiment are mutually exclusive. The experiment (bottom)
strongly favors quantum theory. (Source: Nature, Vol 403,
2000)
or $(-1, +1)$, the third detector would have a $+$, meaning
an outcome $-1$ for the product of the outcomes of det 1,
det 2 and det 3. It is clear that exactly half of the
$2^3 = 8$ possible outcomes will occur in this experiment,
and this selection is an expression of the correlations that
quantum theory produces. And of course the same holds
for the other three experiments in the sequence of four.
Indeed for the fourth XXX experiment, we should express
the state in the XXX-basis, which yields:


$|\psi\rangle = \tfrac{1}{2}\bigl(|+,+,+\rangle + |+,-,-\rangle + |-,+,-\rangle + |-,-,+\rangle\bigr).$ (II.4.9)

The thing to note here is that the product of the measurement
outcomes of the three detectors will always be $+1$,
whatever the component of equation (II.4.9) is that
happens to occur.


Let us now do the analysis following the local realism line
of reasoning. The idea is that the setup is such that there
is no causal relation between the three measurements. This
means that each of the photons should carry an element of
reality for both the X and Y measurements, telling us what
the outcome of such a measurement would be. Let us call
these elements, which are just numbers, $x_i$ and $y_i$, where
$i = 1,2,3$ labels the detector; they can only equal $\pm 1$.
If we now look at the possible outcomes of the XYY measurement
and its permutations, each of the photons can carry only one
particular $x_i$ and $y_i$, which should fit all the possibilities
in (II.4.8). This leads for the first three measurements to
the three equations:


$y_1 y_2 x_3 = -1\,; \quad y_1 x_2 y_3 = -1\,; \quad x_1 y_2 y_3 = -1.$ (II.4.10)

The neat thing is that the solution of these three equations
completely fixes the product $x_1 x_2 x_3$, which then of course
is the local realism prediction for the outcome of the fourth
(XXX) measurement. If we take the product of the three
equations (II.4.10), we get that:


$(y_1 y_2 x_3)(y_1 x_2 y_3)(x_1 y_2 y_3) = (x_1 x_2 x_3)\, y_1^2 y_2^2 y_3^2 = -1.$


With the squares of the $y_i$ being $+1$, we get the prediction
$x_1 x_2 x_3 = -1$. This answer is exactly opposite to the quantum
prediction following from equation (II.4.9), which as we
already mentioned, gives for the product $x_1 x_2 x_3 = +1$! If
we go back to the 8 possible measurement outcomes for
the XXX experiment, this would lead to what is depicted in
Figure II.4.9, for the predictions, and the actual measurement
outcome of the experiment showing extremely strong
support for quantum theory.
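

Both sides of this argument are small enough to check by brute force.
The sketch below is an illustration under assumptions: it enumerates
all predetermined values $x_i, y_i = \pm 1$ compatible with the three
constraints, and separately evaluates the quantum averages of the
products for the GHZ state, using the standard Pauli matrices for the
X and Y measurements. The names `local_products` and `ghz_expectation`
are hypothetical.

```python
import itertools
import math

# Local realism: list every assignment obeying the three YYX-type
# constraints and collect the resulting values of x1*x2*x3.
local_products = {
    x1 * x2 * x3
    for x1, x2, x3, y1, y2, y3 in itertools.product((1, -1), repeat=6)
    if y1 * y2 * x3 == -1 and y1 * x2 * y3 == -1 and x1 * y2 * y3 == -1
}

# Quantum theory: GHZ state (|+1,+1,+1> + |-1,-1,-1>)/sqrt(2) in the Z-basis.
R = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

def ghz_amplitude(a, b, c):
    """Amplitude of the basis state with indices a, b, c (0 is +1, 1 is -1)."""
    return R if a == b == c else 0.0

def ghz_expectation(P, Q, S):
    """Average of the product of outcomes when measuring P, Q, S."""
    return sum(ghz_amplitude(a, b, c) * P[a][d] * Q[b][e] * S[c][f]
               * ghz_amplitude(d, e, f)
               for a, b, c, d, e, f in itertools.product((0, 1), repeat=6))

print(local_products)                 # prints {-1}
print(ghz_expectation(Y, Y, X))       # product of outcomes in YYX
print(ghz_expectation(X, X, X))       # product of outcomes in XXX
```

The enumeration leaves only $x_1 x_2 x_3 = -1$ for local realism,
whereas the quantum average of the XXX product is $+1$, while the YYX
average is $-1$ in agreement with (II.4.10).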


Quantum teleportation 


Quantum teleportation provides a method for privately send- 
ing messages in a way that ensures that the receiver will 
know if anyone eavesdrops. This is possible because a 
quantum state is literally teleported, in the sense of ‘beam 
me up Scotty’: A quantum state is destroyed in one place 
and recreated in another. Because of the no-cloning theorem
that we discussed on page 298 of Chapter II.2, it
is impossible to make more than one copy of this quan- 
tum state, and as a result when the new teleported state 
appears, the original state must be destroyed. Further- 
more, it is impossible for both the intended receiver and 
an eavesdropper to have the state at the same time, which 
helps make the communication secure. 


Quantum teleportation takes advantage of the correlation 
between entangled states as discussed in the previous 
sections. Suppose Alice wants to send a secure message 
to Charlie at a (possibly distant) location. The process of 
teleportation depends on Alice and Charlie sharing differ- 
ent qubits of an entangled state. Alice makes a measure- 
ment of her part of the entangled state, which is coupled 
to the state she wants to teleport to Charlie, and sends 
him some classical information about the entangled state. 
With the classical information plus his half of the entan- 
gled state, Charlie can reconstruct the teleported state. 
We have indicated the process in Figure II.4.10. We follow
the method proposed by Bennett et al. in 1993, and
first realized in an experimental setup by Zeilinger’s group
in 1997. In realistic cases the needed qubit states are
typically implemented as left- and right-handed polarized light
quanta (i.e. photons).


Figure II.4.10: Quantum teleportation. Teleportation of a quantum
state using an entangled pair, as proposed by Bennett et al.
in 1993. An explanation is given in the text.


The simplest example of quantum teleportation can be im- 
plemented with three qubits. The (A) qubit is the unknown 
state to be teleported, 


$|\psi_A\rangle = \alpha|1\rangle + \beta|{-1}\rangle.$ (II.4.11)


This state is literally teleported from one place to another. 
If Charlie likes, once he has the teleported state he can 
make a quantum measurement and extract the same information
about $\alpha$ and $\beta$ that he would have been able
to extract had he made the measurement on the original 
state. 


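The teleportation of this state is enabled by an auxiliary
two-qubit entangled state. We label these two qubits B
and C. For technical reasons it is convenient to represent
the pair in the basis of the four maximally entangled Bell states,


$|\Psi^{(\pm)}_{BC}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|1_B\rangle|{-1}_C\rangle \pm |{-1}_B\rangle|1_C\rangle\bigr), \quad |\Phi^{(\pm)}_{BC}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|1_B\rangle|1_C\rangle \pm |{-1}_B\rangle|{-1}_C\rangle\bigr).$ (II.4.12)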


The process of teleportation can be outlined as follows 
(please refer to Figure 11.4.10). 


1. Someone prepares an entangled two-qubit state BC 
(the Entangled pair in the diagram). 


2. Qubit B is sent to Alice and qubit C is sent to Charlie. 


3. In the Scanning step, Alice measures in the Bell states’
basis the combined wavefunction of qubits A (the 
original in the diagram) and the entangled state B , 
leaving behind the Disrupted original. 


4. Alice sends two bits of classical data to Charlie telling 
him the outcome of her measurements (Send clas- 
sical data). 


5. Based on the classical information received from Al- 
ice, Charlie applies one of four possible operators to 
qubit C (Apply treatment), and thereby reconstructs 
A, getting a teleported replica of the original. If he
likes, he can now make a measurement on A to re- 
cover the message Alice has sent him. 


We now explain this process in more detail. In step (1)
an entangled two-qubit state $|\Psi_{BC}\rangle$ such as that of
equation (II.4.12) is prepared. In step (2) qubit B is transmitted
to Alice and qubit C is transmitted to Charlie. This can
be done, for example, by sending two entangled photons,
one to each of them. In step (3) Alice measures the joint
state of qubits A and B in the Bell states’ basis, getting two
classical bits of information, and projecting the joint
wavefunction $\Psi_{AB}$ onto one of the Bell states. The basis of
Bell states has the nice property that the four possible
outcomes of the measurement have equal probability. To see
how this works, for convenience suppose the entangled
state BC was prepared in the state $|\Psi^{(-)}_{BC}\rangle$. In this
case the combined wavefunction of the three-qubit state is


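$|\psi_{ABC}\rangle = |\psi_A\rangle \otimes |\Psi^{(-)}_{BC}\rangle =$

$\quad \tfrac{\alpha}{\sqrt{2}}\bigl(|1_A\rangle|1_B\rangle|{-1}_C\rangle - |1_A\rangle|{-1}_B\rangle|1_C\rangle\bigr)$
$\quad + \tfrac{\beta}{\sqrt{2}}\bigl(|{-1}_A\rangle|1_B\rangle|{-1}_C\rangle - |{-1}_A\rangle|{-1}_B\rangle|1_C\rangle\bigr).$ (II.4.13)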


If this is expanded in the Bell states’ basis for the pair AB, 
it can be written in the form: 


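$|\psi_{ABC}\rangle = \tfrac{1}{2}\bigl[\,|\Psi^{(-)}_{AB}\rangle\,(-\alpha|1_C\rangle - \beta|{-1}_C\rangle)$
$\qquad + |\Psi^{(+)}_{AB}\rangle\,(-\alpha|1_C\rangle + \beta|{-1}_C\rangle)$
$\qquad + |\Phi^{(-)}_{AB}\rangle\,(\beta|1_C\rangle + \alpha|{-1}_C\rangle)$
$\qquad + |\Phi^{(+)}_{AB}\rangle\,(-\beta|1_C\rangle + \alpha|{-1}_C\rangle)\bigr].$ (II.4.14)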


We see that the qubit pair AB has equal probability to
be in the four possible states $|\Psi^{(-)}_{AB}\rangle$,
$|\Psi^{(+)}_{AB}\rangle$, $|\Phi^{(-)}_{AB}\rangle$ and
$|\Phi^{(+)}_{AB}\rangle$.


In step (4), Alice transmits two classical bits to Charlie, 
telling him which of the four basis functions she observed. 
Charlie now makes use of the fact that in the Bell basis 
there are four possible states for the entangled qubit that 
he has, and his qubit C was entangled with Alice’s qubit B 
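before she made the measurement. In particular, let $|\phi_C\rangle$
be the state of the C qubit, which from equation (II.4.14) is
one of the four states:


$|\phi_C\rangle = \begin{pmatrix} -\alpha \\ -\beta \end{pmatrix},\; \begin{pmatrix} -\alpha \\ \beta \end{pmatrix},\; \begin{pmatrix} \beta \\ \alpha \end{pmatrix},\; \begin{pmatrix} -\beta \\ \alpha \end{pmatrix}.$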


In step (5), based on the information that he receives from
Alice, Charlie selects one of four possible operators $F_i$ and
applies it to the C qubit. There is one operator $F_i$
for each of the four possible Bell states, which are
respectively:


$F_1 = -\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\; F_2 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},\; F_3 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\; F_4 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$ (II.4.15)

Provided Charlie has the correct classical information and
an intact entangled state he can reconstruct the original
A qubit by acting on $|\phi_C\rangle$ with the appropriate operator
$F_i$:

$|\psi_A\rangle = \alpha|1\rangle + \beta|{-1}\rangle = F_i|\phi_C\rangle.$ (II.4.16)


By simply multiplying out each of the four possibilities it is easy
to verify that, as long as his information is correct, he will
correctly reconstruct the A qubit $\alpha|1_A\rangle + \beta|{-1}_A\rangle$.
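

The whole procedure can be followed numerically for a sample state.
The sketch below is a toy calculation under stated assumptions: the
amplitudes `alpha` and `beta` are hypothetical values, the pair BC is
taken to be in the antisymmetric Bell state, and the tables `BELL` and
`F` together with the function `teleport` are names introduced here
for illustration, with the correction operators chosen to match that
preparation.

```python
import math

R = 1 / math.sqrt(2)
alpha, beta = 0.6, 0.8j          # hypothetical amplitudes, norm 1
psi_A = [alpha, beta]            # index 0 is |1>, index 1 is |-1>

# Bell states as (real) amplitude tables over two qubits
BELL = {
    "Psi-": [[0, R], [-R, 0]],
    "Psi+": [[0, R], [R, 0]],
    "Phi-": [[R, 0], [0, -R]],
    "Phi+": [[R, 0], [0, R]],
}
# Correction operators, one for each Bell outcome found by Alice
F = {
    "Psi-": [[-1, 0], [0, -1]],
    "Psi+": [[-1, 0], [0, 1]],
    "Phi-": [[0, 1], [1, 0]],
    "Phi+": [[0, 1], [-1, 0]],
}

def teleport(outcome):
    """Return (probability of Alice's outcome, Charlie's corrected qubit)."""
    pair_BC = BELL["Psi-"]
    bell_AB = BELL[outcome]
    # Project qubits A and B on the Bell state; what remains is qubit C.
    phi_C = [sum(bell_AB[a][b] * psi_A[a] * pair_BC[b][c]
                 for a in range(2) for b in range(2)) for c in range(2)]
    prob = sum(abs(x) ** 2 for x in phi_C)
    phi_C = [x / math.sqrt(prob) for x in phi_C]
    corrected = [sum(F[outcome][i][j] * phi_C[j] for j in range(2))
                 for i in range(2)]
    return prob, corrected

for name in BELL:
    print(name, teleport(name))
```

Each of the four outcomes occurs with probability 1/4, and after the
matching correction Charlie's qubit carries the original amplitudes.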


We stress that Charlie needs the classical measurement 
information from Alice. If he could do without it the tele- 
portation process would violate causality, since information 
could be transferred instantaneously from Alice to Charlie. 
That is, when Alice measures the B qubit, naively it might 
seem that because the B and C qubits are entangled, this 
instantaneously collapses the C qubit, sending Charlie the 
information about Alice’s measurement, no matter how far 
away he is. To understand why such instantaneous communication
is not possible, suppose Charlie just randomly
guesses the outcome and randomly selects one of the four
operators $F_i$. Then the original state will be reconstructed
as a random mixture of the four possible incoming states
$|\phi_C\rangle$. This mixture does not give any information about the
original state $|\psi_A\rangle$. The same reasoning also applies to a
possible eavesdropper, conveniently named Eve. If she 
manages to intercept qubit (C) and wants ‘to measure it’ 
before Charlie does, without the two bits of classical infor- 
mation, she will not be able to recover the original state. 
Furthermore she would affect that state. If Charlie somehow
gets the mutilated state, he will not be able to reconstruct
the original state $|\psi_A\rangle$. Security can be achieved if Alice
first sends a sequence of known states which can be
checked by Charlie after reconstruction.
Superposition. The strange thing
about a qubit in comparison with its
digital precursor is the fact that it can
be in a state that is a ‘superposition’ of
the ‘1’ and the ‘0’ state. This is possible because 
of the all-important linear superposition principle 
which is a basic ingredient of quantum theory. As a
consequence, quantum information processing,
the manipulation of qubits, i.e. changing their
states by having them interact, is like doing parallel
processing on a large scale. The exceptional
power of the quantum computers of the future is a 
reflection of the ability to directly work with these 
linear superpositions. Here is an analogy that may 
help you understand why this is so. Imagine you 
would like to make a street map of a city to find the 
shortest route from point P on one side of town to 
point Q on the opposite side. As a single being you 
go and walk in the right direction, and to find the 
shortest route you should walk in principle all the 
possible routes that bring you from P to Q and com- 
pare their lengths. Parallel processing would mean 
that you hire a bunch of students to independently 
and simultaneously take different paths from P to 
Q. This certainly will save time. But now imagine 
that some Dr Ghetto Blaster comes along with a 
device which produces lots of sound at point P and 
his business partner Dr Ghetto Digest sits down 
at point Q with an impressive highly sophisticated 
listening device. He turns the machine on and in 
no time has reconstructed the street map. Imagine! 
The remarkable thing is that this is in principle 
possible because sound as an agent always takes 
all possible paths through town simultaneously, and 
interferes with itself on every corner, and all that 
information is encoded in the changes of the signal 
that we would receive in Q. It probes the street 
plan not sequentially but in parallel. A fashionable
version of this story is to say that you can hear the
shape of a remote drum if somebody is playing it, or
that you can hear the shape of a tin roof by listening
to the rain pouring on it. This is so because there
are many ticks and every tick in a sense ‘contains’
all frequencies and therefore these examples are
classical analogues and show the potential power
of the linear superposition principle.


If the original and reconstructed sequence are perfectly
correlated, then that guarantees that Eve is not interfering.
Note that the no-cloning theorem is satisfied, since
when Alice makes her measurement she alters the state
$|\psi_A\rangle$ as well as her qubit B. Once she has done that, the
only hope to reconstruct the original $|\psi_A\rangle$ is for her to send
her measurement result to Charlie, who can apply the appropriate
operator to his entangled qubit C.
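The teleportation steps described above can be checked numerically with a small state-vector simulation. The sketch below is only an illustration under stated assumptions: it labels spin up $|1\rangle$ as index 0 and spin down $|-1\rangle$ as index 1, takes the shared BC pair to be $(|00\rangle + |11\rangle)/\sqrt{2}$, and uses the textbook corrections X and Z for Charlie, which need not coincide with the sign conventions of the operators $F_i$ in equation (II.4.15). The function name `teleport` is hypothetical.

```python
import math
import random

def teleport(alpha, beta, rng):
    """Teleport alpha|up> + beta|down> from qubit A to qubit C.

    Assumed conventions (not taken from the text): index 0 is |1> (up),
    index 1 is |-1> (down); amplitudes are indexed by 4*a + 2*b + c.
    """
    s = 1 / math.sqrt(2)

    # Initial state: |psi_A> times the entangled pair (|00> + |11>)/sqrt(2) on BC.
    state = [0j] * 8
    for a, amp in ((0, alpha), (1, beta)):
        for bc in (0b00, 0b11):
            state[4 * a + bc] = amp * s

    # Alice's Bell measurement on AB, done as CNOT(A -> B), Hadamard on A,
    # followed by a measurement of A and B in the up/down basis.
    after_cnot = [0j] * 8
    for i, amp in enumerate(state):
        a = (i >> 2) & 1
        after_cnot[i ^ (a << 1)] += amp           # flip b when a == 1
    after_h = [0j] * 8
    for i, amp in enumerate(after_cnot):
        a, rest = (i >> 2) & 1, i & 3
        after_h[rest] += s * amp                   # component with a' = 0
        after_h[4 + rest] += s * amp * (-1) ** a   # component with a' = 1

    # Probabilities of the four outcomes (a, b): the two classical bits.
    probs = [abs(after_h[2 * ab]) ** 2 + abs(after_h[2 * ab + 1]) ** 2
             for ab in range(4)]
    ab = rng.choices(range(4), weights=probs)[0]
    a, b = ab >> 1, ab & 1

    # Charlie's qubit after Alice's measurement (renormalized).
    norm = math.sqrt(probs[ab])
    c0, c1 = after_h[2 * ab] / norm, after_h[2 * ab + 1] / norm

    # Charlie's correction, selected by the two bits: X if b == 1, then Z if a == 1.
    if b:
        c0, c1 = c1, c0
    if a:
        c1 = -c1
    return (a, b), (c0, c1), probs

bits, qubit_c, probs = teleport(0.6, 0.8j, random.Random(0))
print(bits, qubit_c, probs)
```

In this sketch the four outcomes each occur with probability 1/4 whatever $\alpha$ and $\beta$ are, so Alice's two bits by themselves say nothing about the teleported state; only after the correction does Charlie's qubit carry the amplitudes $\alpha$ and $\beta$.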


The quantum security mechanism of teleportation is based 
on strongly correlated, highly non-local entangled states. 
While a strength, the non-locality of the correlations is also 
a weakness. Quantum correlations are extremely fragile 
and can be corrupted by random interactions with the en- 
vironment, i.e. by decoherence. As we discussed before, 
this is a process in which the quantum correlations are de- 
stroyed and information gets lost. The problem of deco- 
herence is the main stumbling block in making progress 
towards large-scale development and application of quan- 
tum technologies. Nevertheless, the research group of 
Gisin et al. at the University of Geneva demonstrated tele- 
portation over a distance of 550 meters using the optical 
fiber network of Swisscom in 2006. 


An important next step would be the construction of a net-
work of quantum devices with links along which entangled 
states can be created and quantum information teleported 
securely. In 2022 the first successful steps were reported 
by the QuTech group of Hanson in Delft. 
Figure II.4.11: Trapped ions. Ions trapped in a linear optical
lattice. (IQO Innsbruck)


Quantum computation 


Quantum computation is performed by setting up controlled
interactions that cause non-trivial dynamics, successively
couple individual qubits together and generate a time
evolution of the quantum state in a predetermined manner,
while ensuring that no other interactions take
place that could corrupt the computation. A multi-qubit
system is first prepared in a known initial state, represent- 
ing the input to the program. Then interactions are switched 
on by applying forces, such as magnetic fields, that deter- 
mine the direction in which the wavefunction rotates in its 
state space. Thus a quantum program is just a sequence 
of unitary operations that are externally applied to the initial 
state. This is achieved in practice by a corresponding se- 
quence of quantum gates. When the computation is done 
measurements are made to read out the final state. Mea- 
surements are non-unitary operations that can also be part 
of the process. 


Quantum computation is essentially a form of analog computation.
A physical system is used to simulate a mathematical
problem, taking advantage of the fact that they
both obey the same equations. The mathematical problem 
is mapped onto the physical system by finding an appropri- 
ate arrangement of magnets or other fields that will gener- 
ate the proper equation of motion. One then prepares the 
initial state, lets the system evolve, and reads out the an- 
swer. Analog computers are nothing new. For example, 
Leibniz built a mechanical calculator for performing multiplication
in 1694, and in the middle of the twentieth century,
because of their vastly superior speed in comparison with 
digital computers, electronic analog computers were often 
used to solve differential equations. 


Then why is quantum computation special? The key to its 
exceptional power is the massive parallelism at intermedi- 
ate stages of the computation. Any operation on a given 
state works simultaneously on all basis vectors and thus 
also on entangled states. The physical process that defines
the quantum computation for an n-qubit system thus
acts in parallel on a set of $2^n$ complex numbers, and the
phases of these numbers (which would not exist in a clas- 
sical computation) are important for determining the time 
evolution of the state. When the measurement is made 
to read out the answer at the end of the computation we 
are left with the n-bit output and the phase information is 
lost. 
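To make the bookkeeping concrete, here is a minimal classical sketch of such a computation for n = 3 qubits: prepare an input, apply a sequence of unitary gates, and read out. Every gate application touches all $2^n$ amplitudes, and a gate that changes only phases still changes the measured answer after interference. The helper names `apply_gate` and `measure` are hypothetical, and qubit 0 is taken (by assumption) to be the least significant bit of the index.

```python
import math
import random

def apply_gate(gate, target, state):
    """Apply a 2x2 gate to qubit `target` of a state given as 2**n amplitudes."""
    new_state = [0j] * len(state)
    mask = 1 << target
    for index, amplitude in enumerate(state):
        bit = (index >> target) & 1
        new_state[index & ~mask] += gate[0][bit] * amplitude
        new_state[index | mask] += gate[1][bit] * amplitude
    return new_state

def measure(state, rng):
    """Sample one n-bit outcome; the phases of the amplitudes are discarded."""
    weights = [abs(amplitude) ** 2 for amplitude in state]
    return rng.choices(range(len(state)), weights=weights)[0]

S = 1 / math.sqrt(2)
H = ((S, S), (S, -S))      # Hadamard gate
Z = ((1, 0), (0, -1))      # phase flip

n = 3
state = [0j] * 2 ** n
state[0] = 1                            # input: all qubits in the first basis state
for q in range(n):
    state = apply_gate(H, q, state)     # 2**n equal amplitudes
uniform = list(state)
state = apply_gate(Z, 0, state)         # changes only relative phases
for q in range(n):
    state = apply_gate(H, q, state)     # interference turns phases into an answer
outcome = measure(state, random.Random(1))
print(format(outcome, "03b"))
```

Without the Z gate the second round of Hadamard gates would simply undo the first and return the input; with it, the outcome has qubit 0 flipped. A classical simulation like this one has to store all $2^n$ numbers explicitly, which is what becomes infeasible for large n.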


Because quantum measurements are generically proba- 
bilistic, it is possible for the ‘same’ computation to yield dif- 
ferent ‘answers’, e.g. because the measurement process 
projects the system onto different eigenstates. This can re- 
quire the need for error correction mechanisms, though for 
some problems, such as factoring large numbers, it is pos- 
sible to test for correctness by simply checking the answer 
to be sure it works. It is also possible for quantum comput- 
ers to make mistakes due to decoherence, i.e. because 
of essentially random interactions between the quantum 
state used to perform the computation and the environ- 
ment. This also necessitates error correction mechanisms. 
Figure II.4.12: Optical lattice. Atoms can be manipulated in a
linear optical lattice. (IQO Innsbruck)


The problems caused by decoherence are perhaps the 
central difficulty in creating realistic physical implementa- 
tions of quantum computation. These can potentially be 
overcome by constructing quantum systems where states 
are not encoded locally, but rather globally, in terms of 
topological properties of the system that cannot be dis- 
rupted by external (local) noise. This is called topological 
quantum computing. This interesting possibility arises in 
certain two-dimensional physical media which exhibit topo- 
logical order, referring to states of matter in which the es- 
sential quantum degrees of freedom and their interactions 
are topological (see also Chapter III.3). 


Quantum gates and circuits 


In the same way that classical gates are the building blocks
of classical computers, quantum gates are the basic building
blocks of quantum computers. A gate used for a classical
computation implements binary operations on binary
inputs, changing zeros into ones and vice versa. For example,
the only non-trivial single bit logic operation is NOT,
which takes 0 to 1 and 1 to 0. In a quantum computation
the situation is quite different, because the states of qubits
live in a two-dimensional Hilbert space and they represent
complex superpositions of 0 and 1. This was discussed in
considerable detail in Chapter II.1.


Figure II.4.13: Gates. Some standard one-bit quantum gates.


Single qubit gates. The set of allowable single qubit operations
consists of unitary transformations corresponding
to $2 \times 2$ complex matrices $U$ such that $U^\dagger U = 1$. The
corresponding action on a single qubit is represented in a
circuit as illustrated in Figure II.4.13.


Some quantum gates have classical analogues, but many 
do not. As we mentioned, the X operator is the quan- 
tum equivalent of the classical NOT gate, and serves the 
function of interchanging spin up and spin down. In con- 
trast, the Z operator rotates the relative phase of the two- 
component wavefunction by 180 degrees and has no clas- 
sical equivalent. 


Let us briefly discuss the typical one-qubit logical gates of
Figure II.4.13. First the NOT gate,

$$ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, $$

as we mentioned this is the quantum equivalent of the classical
NOT gate and acts by interchanging $|1\rangle$ and $|-1\rangle$.


The next one is

$$ P(\theta) = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix}. $$

The $P(\theta)$ operation is called the phase gate, since it changes
the relative phase by an angle $\theta$.


The third gate is the so-called Hadamard gate H,

$$ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, $$

which creates a superposition of the basis states: $|\pm 1\rangle \to |\pm\rangle$.
In other words it flips between the Z- and the X-frames.
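These properties are easy to verify by direct matrix multiplication. The following sketch assumes the standard matrix forms of the three gates (X as the off-diagonal NOT matrix, $P(\theta)$ as diag(1, $e^{i\theta}$), and the Hadamard gate with its $1/\sqrt{2}$ normalization); the helper names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def dagger(u):
    """Hermitian conjugate (conjugate transpose) of a 2x2 matrix."""
    return tuple(tuple(u[j][i].conjugate() for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    """True when two 2x2 matrices agree entry by entry within `tol`."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

S = 1 / math.sqrt(2)
IDENTITY = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
H = ((S, S), (S, -S))

def P(theta):
    """Phase gate: leaves the first basis state alone, multiplies the second by e^{i theta}."""
    return ((1, 0), (0, cmath.exp(1j * theta)))

# Every gate is unitary: U^dagger U = 1.
for gate in (X, Z, H, P(math.pi / 4)):
    assert close(matmul(dagger(gate), gate), IDENTITY)

# A phase rotation by 180 degrees is the Z operator.
assert close(P(math.pi), Z)

# The Hadamard gate switches between the Z- and X-frames: H Z H = X and H X H = Z.
assert close(matmul(H, matmul(Z, H)), X)
assert close(matmul(H, matmul(X, H)), Z)
```

The last two checks express the statement about frames: conjugating with H turns the operator that is diagonal in one frame into the one that is diagonal in the other.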


The general purpose of a quantum computer is to trans- 
form an arbitrary n-qubit input into an n-qubit output cor- 
responding to the result of the computation. In principle im- 
plementing such a computation might be extremely com- 
plicated, and might require constructing quantum gates of 
arbitrary order and complexity. 


Universal gate sets. Fortunately, it has been shown that 
the transformations needed to implement a universal quan- 
tum computer can be generated by a simple — so-called 
universal — set of elementary quantum gates, for example 
involving a well-chosen set of one- and two-qubit gates. 
Single qubit gates are unitary matrices with three real degrees
of freedom. If we allow ourselves to work with finite
precision, the set of all gates can be arbitrarily well approximated
by a small, well-chosen set. There are many
possibilities — the optimal choice depends on the physical 
implementation of the qubits. 


From the perspective of experimental implementation, a
convenient two-qubit gate to use is the CNOT gate we
have discussed before, see Figure II.1.17. The combination
of the CNOT, the $P(\pi/4)$ and the Hadamard gate
forms, for example, a universal set.


Shor’s algorithm 


Prime factoring. An algorithm is not an equation; it is
more like an operational set of steps, a procedure, that
is guaranteed to lead to a desired result. It usually
does involve equations and a mathematical proof. For the
Shor algorithm the problem is to factor a large number,
say of about 800 or 1000 digits, into its prime factors; in
the cases of interest there are just two of them. So we have a
number N that can be written in a unique way as a
product of two prime numbers a and b. One way to find them
is just by trial and error: checking one after
the other whether 2, 3, 5, ... is a divisor of the number
N. This you may do by a simple subtraction scheme
à la Euclid, where you keep subtracting the candidate divisor
and look whether you indeed hit zero. As we have
argued in Chapter I.3, such schemes end up being extravagantly
costly in the time it takes to actually factor a really
big number. That time is significantly longer than the age
of the universe, and that should not surprise you. The one
thing it makes clear at least is that patience will not suffice.
The time needed on conventional digital computers is
typically exponential in the number of digits of N. The
showcase example
of why quantum computers are indeed fundamentally
different, and for a task like this one far superior, is the
Shor factorization algorithm, which is a quantessential algorithm
because it exploits the non-commutativity of operators
in a clever way.


The MIT applied mathematics professor Peter Shor pro- 
posed the algorithm in 1994 and was co-recipient of the 
2023 Breakthrough Prize in Fundamental Physics. 
Figure II.4.14: The periodic function. We have displayed
$f(x) = c^x \bmod N$ (red curve). The blue points represent the
(discrete) periodic function over the integers. We have chosen
c = 2 and N = 21. The period equals 6.


The algorithm. The algorithm for factorization consists of
three steps.

(i) construct a particular periodic function modulo N,

(ii) determine the period of that function,

(iii) given (i) and (ii) one can use Euclid's method for finding
greatest common divisors to find the factors a and b
such that ab = N.


(i) Construction of the periodic function. Choose an integer
c and consider the function⁴

$$ f(m) = c^m \bmod N, \qquad m \ \text{integer}. \tag{II.4.17} $$


(ii) One can show that, provided c and N have no factor in
common, this function is periodic with a period
we call r, so

$$ f(m + r) = f(m). \tag{II.4.18} $$


⁴ The number m = M mod N is obtained by subtracting N from M
until a number between 0 and N is obtained, which is the number m.
In other words M = m + kN for some k with 0 ≤ m < N.


After substitution of f on both sides, it then follows that

$$ c^r = 1 \bmod N \quad\Rightarrow\quad c^r - 1 = sN, $$

where s is some integer. Now rewrite the left-hand side
as:

$$ (c^{r/2} + 1)(c^{r/2} - 1) = sN, $$


where we need r to be even for the factors to be integers. 
If r happens to be odd, one has to restart by choosing a 
different value for c and start all over. 

(iii) The next step is to find the greatest common divisor of
each of the individual factors on the left with N, after which one
obtains the prime factors a and b of N. This last step can
be done with Euclid's subtraction scheme.
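The three steps can be sketched classically in a few lines, with the hard step (ii) done by brute force, which is precisely the part a quantum computer is meant to take over. The function names are hypothetical. The sketch also includes one standard caveat that the steps above leave implicit: if $c^{r/2} = -1 \bmod N$, the greatest common divisors only return the trivial factors, and one has to restart with another c, just as for odd r.

```python
from math import gcd

def find_period(c, N):
    """Step (ii) by brute force: the smallest r > 0 with c**r mod N == 1."""
    r, value = 1, c % N
    while value != 1:
        value = (value * c) % N
        r += 1
    return r

def factor_from_period(c, N):
    """Steps (i)-(iii) for a chosen c; returns (a, b) with a * b == N, or None
    when this choice of c fails and one has to start over with another c."""
    if gcd(c, N) != 1:
        raise ValueError("c and N must have no common factor")
    r = find_period(c, N)
    if r % 2 == 1:
        return None                    # r odd: c**(r/2) is not an integer
    half = pow(c, r // 2, N)           # c**(r/2) mod N
    if half == N - 1:
        return None                    # c**(r/2) = -1 mod N: only trivial factors
    a, b = gcd(half + 1, N), gcd(half - 1, N)   # step (iii), Euclid's algorithm
    return tuple(sorted((a, b)))

print(factor_from_period(2, 21))   # the example c = 2, N = 21 with period 6
```

The loop in `find_period` may need of the order of N steps, which is hopeless for a number of a thousand digits; the remaining steps are cheap.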


The hard part of this solution method is to find the period
r of the function f(m), because this r may be of order N
itself. The period can be determined using a discrete (fast)
Fourier transform of f(m).


As we discussed wavefunctions and non-commuting operators
as hallmarks of quantum theory, it is maybe nice to
paraphrase this hard side of the problem and to see that
quantum measurement is the clue. Firstly, think of the function
f(m) as a wavefunction on a one-dimensional lattice
corresponding to the natural numbers 0, 1, 2, 3, ... . Now
we have also discussed a momentum operator P which
translates the position variable by one unit. Acting on
the function it gives $P f(m) = f(m + 1)$. Because of
the periodicity of f(m) we also have the relation
$P^r f(m) = f(m + r) = f(m)$, from which we conclude that $P^r = 1$.
It follows that the eigenvalues of the P operator
are $p = e^{2\pi i s/r}$ with $s = 0, 1, \ldots, r - 1$. In other words,
doing a measurement of the momentum of the state described
by the wavefunction f(m) tells us basically what r
is.⁵ What we end up with is a periodic function with a support
of r points on a circle and, dual to that, the momentum
sample space also consisting of r points. This is of course
due to the periodicity of the function. I recall the statement
about the relation between the sample spaces of position
versus momentum operators. A line is dual to a line. If the
x-space is infinite discrete then the sample space of the
dual momentum is an angle or a circle; by bringing in the
periodicity only a set of r points on the circle is left, corresponding
to the corners of a polygon. And in that case the
P and X sample spaces are again the same. There are basically
two identical polygons and there is a unitary transformation
between the frames that correspond to the sets
of eigenvectors corresponding to the eigenvalues. Stated
differently, the problem of factoring is to a large extent finding
the right polygon hidden in the circle, and indeed there
are many (a countable infinity) to choose from.


⁵ One may need more than one measurement, but one can check
that rather easily.


Figure II.4.15: The Fourier transformed function F(k). We have
displayed Re F(k) (blue curve) and Im F(k) (red curve). The
peaks at multiples of 1/6 stand out clearly, even in this crude
‘iPhone’ approximation causing some noise. This means that
the periodicity of the original function f(m) (the blue dots in the
previous figure) would be 6.


The fast Fourier transform of a function f(n) is defined
as:

$$ F(k) = \sum_n f(n)\, e^{2\pi i k n}, \tag{II.4.19} $$

which combined with the fact that f(n) has a period r leads
to a powerful conclusion on the function F(k). For the function
(II.4.17) it leads to the strong condition:

$$ e^{2\pi i k r} = 1 \quad\Rightarrow\quad k = s/r, \qquad s = 1, \ldots, r. $$


What this means is that we ask for the transformation of a 
(wave) function f(x) on a one-dimensional infinite lattice, 
from the position state basis to a momentum state basis. 
We know that the momentum values for an infinite discrete 
space correspond to an angle $0 \le \theta < 2\pi$, where in
our case $\theta = 2\pi k = 2\pi s/r$. So what we learn is that
the function f involves only r different momentum states. 
The fast Fourier transform just measures the momentum 
and determines the component of that momentum eigen- 
state. The magnitude of that component is not so relevant 
as what the actual allowed momenta are. So the momen- 
tum state is almost everywhere zero except in points that 
correspond to the corners of a polygon with r sides where 
they have the value F(k). 

Wouldn't it be fun to find an example where we would be 
left with a pentagon, what do I say, THE pentagon, in momentum
space? Maybe that explains the Pentagon’s interest
in quantum computing and maybe they knew all along
that the pentagon would play an important role somewhere.... 


So the data we need from the fast Fourier transform just 
corresponds to one or more measurements of the momentum
in the state f. That will give us one or more values $p = 2\pi s/r$
from which r can be determined. So it is now clear that 
quantum measurements implement an extremely efficient 
algorithm for fast Fourier transform on integer-valued func- 
tions. You just have to measure the non-commuting ob- 
servable dual to the variable of the function, and that is the 
momentum. And that is the quantessence of super fast 
factorization. 
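The way the Fourier transform exposes the period can be illustrated with an ordinary classical transform. The sketch below rests on simplifying assumptions: it samples $f(x) = 2^x \bmod 21$ on M = 60 points, a window that happens to be a multiple of the period so that the peaks are exact, and it uses the sign convention $e^{+2\pi i k n/M}$ without normalization. The function names are hypothetical, and this classical transform has none of the efficiency of the quantum measurement.

```python
import cmath
from functools import reduce
from math import gcd

def dft(samples):
    """Plain discrete Fourier transform, F(k) = sum_n f(n) exp(2 pi i k n / M)."""
    M = len(samples)
    return [sum(f * cmath.exp(2j * cmath.pi * k * n / M)
                for n, f in enumerate(samples))
            for k in range(M)]

def period_from_spectrum(samples, tol=1e-6):
    """Read off the period from the positions of the non-zero Fourier components."""
    M = len(samples)
    spectrum = dft(samples)
    support = [k for k in range(1, M) if abs(spectrum[k]) > tol]
    step = reduce(gcd, support, M)     # the peaks sit at multiples of M / r
    return M // step

samples = [pow(2, x, 21) for x in range(60)]
print(period_from_spectrum(samples))
```

Only components at multiples of M/r = 10 survive, which corresponds to the allowed values k = s/r. For this particular function the component at k = 1/2 happens to vanish as well, which is why the sketch takes the greatest common divisor of all peak positions instead of relying on a single peak.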


Let us work out a simple example, and let us try to factor
the number N = 21 with the algorithm. We first construct
the function $f(x) = 2^x \bmod N$; it takes the values given in
Table II.4.1.


Table II.4.1: Tabulation of the function f(x) = 2^x mod 21.

x    | 0  1  2  3  4   5   6  7  8  9  10
f(x) | 1  2  4  8  16  11  1  2  4  8  16


We see that the function has a period r = 6,
so we obtain the equation:

$$ (2^3 + 1)(2^3 - 1) = 9 \times 7 = 21 \times s. $$


Now determine the greatest common divisor of each of the factors
on the left with 21:

21 mod 7 = 0 ⇒ 7 is a factor of 21, and 21 mod 9 = 3, 9 mod 3 = 0 ⇒
3 is a factor of 21. Thus we established the magical result
that 21 = 3 × 7. One could say that we at least succeeded
in cracking a nut by using a magnificent sledgehammer.


But to factor a 1000 digit number into two primes you will 
need this sledgehammer in the form of a sizable quantum 
computer to find the period, which after all might well be of 
the order of N itself! 


Applications and perspectives 


Quantum computation and security are challenging exam- 
ples of the surprising interplay between the basic concepts 
of physics and information theory. If physicists and engi- 
neers succeed in mastering quantum technologies to al- 
low for reliable and scalable qubits, it will mark an impor- 
tant turning point in information science with profound so- 
cietal consequences. We had better get ready for an era 
of quantum supremacy! 


Hardware developments. As we mentioned already, at
present there is a lot of work in progress trying to implement
quantum computing in a wide variety of ways. I will
refrain from going into any detail here, firstly because that
calls for many different types of expertise, and furthermore
because the developments go so fast and still make so many
unexpected turns that I would run the risk that this book would
already be out-of-date before it was published. It is absolutely
clear, however, that basically all big tech companies
are actively pursuing the quantum opportunities that suit
them. In principle all that is needed to make a qubit is a
simple two-level quantum system that can easily be ma- 
nipulated and scaled up to a large number of qubits. The 
first requirement is not so restrictive, and many different 
physical implementations of systems with a single or a few 
qubits have been realized, including NMR, spin lattices, lin- 
ear optics with single photons, quantum dots, Josephson 
junction networks, ion traps and atoms and polar mole- 
cules in optical lattices. 


The much harder problem that has so far limited progress 
toward practical computation is to couple the individual 
qubits in a controllable way and to achieve a sufficiently low 
level of decoherence. Even small local perturbations due 
to the environment could destroy the delicate phase information
in the linear superposition of states. With respect
to this problem, a promising avenue has surfaced with the
advent of topological quantum computing, where quantum
information is stored in topological degrees of freedom
that are insensitive to local perturbations and interactions,
making error correction procedures simpler to implement.
This way of computing involves new states of matter
that exhibit what is called topological order. In Chapter III.3
we'll say more about this. On the software side
impressive progress has been made, building on the fun- 
damental quantum algorithms we have mentioned. There 
is of course also the possibility of developing hybrid classi- 
cal/quantum devices. Nevertheless, with the great efforts 
now taking place, future developments could be surpris- 
ingly fast. 


The challenge of quantum software. We are in a situation
that looks like the early seventies, when many institutions
in what was still Silicon Valley-to-be started focusing
on developing software for digital devices like PCs
and laptops that weren’t really there yet. This major effort
was to a large extent based on the strongly held belief that
a digital era was on its way where every individual would 
own powerful devices, to play and work with. High level 
languages had to be developed to allow everybody to op- 
timally process data, whether it concerned text, pictures, 
symbolic manipulation or music. It turned into an unprece- 
dented show-case of public and private research and de- 
velopment efforts, which resulted in the present informa- 
tion era which in many ways has profoundly changed the 
human condition. 


We are now in a comparable situation with respect to quantum
computing. And again, even though the hardware
is still quite remote, a strong case for quantum software 
should be made. If we were to have quantum comput- 
ers at our disposal, the question of what miracles could 
they possibly perform strongly depends on the software 
that is available. We said in the introduction to this sec- 
tion that there are many problems where the intrinsic mas- 
sive parallelism of quantum evolution might yield dramatic 
speedups in computation. The point is not that a classical 
computer would not be able to do the same computation — 
after all, one can always simulate a quantum computer on 
a classical one — but rather the time that is needed could 
drastically be reduced. 


As we just discussed in some detail, a most spectacular
speedup is achieved by the Shor algorithm (1994) for
factoring large numbers into their prime factors. Because
many security keys are based on the inability of digital
computers to do this, the reduction from an exponentially
hard to a polynomially hard problem has many practical
applications for breaking security codes and current cryptography.
This means that even today one already has to
worry about how one should store sensitive information, to
make sure that it cannot be easily retrieved in the near
quantum future. Quantum algorithms also allow one to
provide new, more secure crypto-codes that in principle allow
users to run programs on untrusted systems and still
keep their data secret.


Another important application is the quadratic speedup by 
Grover’s search algorithm (1996) over conventional search 
algorithms, addressing for example problems like the ‘trav- 
eling salesman’, in which large spaces of possibilities need 
to be searched and compared. 
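The quadratic speedup can be made tangible with a small classical simulation of the amplitudes in Grover's search. This is a sketch of the standard textbook form of the algorithm (a sign flip on the marked item followed by an inversion about the mean), not something worked out in the text above; the function name and the choice of 16 items are purely illustrative.

```python
import math

def grover_success_probability(n_items, marked, iterations):
    """Probability of finding the marked item after the given number of Grover steps."""
    amplitudes = [1 / math.sqrt(n_items)] * n_items   # uniform superposition
    for _ in range(iterations):
        amplitudes[marked] = -amplitudes[marked]      # oracle: flip the marked sign
        mean = sum(amplitudes) / n_items
        amplitudes = [2 * mean - a for a in amplitudes]   # inversion about the mean
    return amplitudes[marked] ** 2

n_items = 16
steps = round(math.pi / 4 * math.sqrt(n_items))       # about sqrt(N) steps
print(steps, grover_success_probability(n_items, marked=5, iterations=steps))
```

For 16 items three steps already give a success probability above 95 percent, whereas a classical search needs on average about half of the 16 lookups; the gap grows as the square root of the size of the search space.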


Machine learning is another hot topic where the discovery
of an exponential speedup for solving certain systems of
linear equations has led to a flurry of new developments, like
algorithms for core problems such as data fitting and support
vector machines.


Finally, a vital application is the efficient simulation of quan- 
tum physical and chemical systems, which at present is 
an extremely costly business taking up much of our su- 
percomputer capacity. This development is of importance 
to fields like chemistry, material science and high-energy 
physics. In this area a quantum computer naturally would 
offer an exponential speedup, which in turn would directly 
feed back into the successful development of new quantum
technologies. Science is time and again an incredible
innovation engine; we are standing at the dawn of a
new era and wonder where quantum technologies will lead
us.
Chapter II.5


Particles, fields and statistics 


In fact the smallest units of matter are not phys- 
ical objects in the ordinary sense; they are forms, 
ideas which can be expressed unambiguously only 
in mathematical language. 

Werner Heisenberg 


In Chapters II.1 and II.2, we mainly focussed on the qubit,
because in its simplicity it was most suitable to demonstrate
the quantessentials. In this chapter we turn to particles
and fields. We start by discussing the one-particle
Schrödinger and Heisenberg equations in more detail. Next
we turn to fields and their quantization, and explain how
the resulting Hilbert space describes multiparticle states.
We close the chapter with a discussion of the topological
origins of indistinguishability, Pauli’s exclusion principle
and the spin-statistics connection.


Particle states and wavefunctions 


Whereas the state of a single particle in classical physics 
is fixed by specifying its position and its velocity, i.e. by 
giving 6 numbers, the state of a quantum particle is spec- 
ified by giving its wavefunction, a continuous function that 
extends over all of space. How different can the quantum 
world be? 


Figure II.5.1: Moving particles. Various particle motions as a
function of time (t) in configuration (x) space. A particle succes- 
sively: at rest (orange), moving with constant momentum (pur- 
ple), in an oscillatory motion (red), and in a damped oscillation 
(blue). 


Phase space. Let us consider a single particle with a 
given mass m and assume that it has no internal structure. 
In classical mechanics we specify its state by just saying 
what its position x and its velocity v or momentum p = mv 
are. Once we fix its position and momentum at a given 
instant in time, Newton’s laws do the rest: given the
force, they completely determine the future states of the
particle. The motion of the particle can be thought of as an 
orbit or trajectory parametrized by time in ordinary three- 
dimensional position or configuration space of the particle. 
Figure II.5.2: Particle motions. The same particle motions as
in Figure II.5.1 as a curve parametrized by t in phase space
(x(t), p(t)).


In the one-dimensional case we would plot the position 
x = x(t) with the value x on the vertical axis as a func- 
tion of t along the horizontal axis. Alternatively we may 
think of the motion as a time parametrized curve through 
the combined momentum and position space which is also 
called the phase space of the particle. The phase space 
has twice the number of dimensions, because to the d 
components of the position vector one has to add the d 
components of the momentum vector. 
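As an illustration of such a phase-space curve, the sketch below integrates Newton's equations for a one-dimensional oscillator with force F = −kx, corresponding to the oscillatory motion among the examples. The function names, the unit values of m and k, and the leapfrog integration scheme are my own assumptions for the purpose of the example.

```python
def phase_space_orbit(x0, p0, m=1.0, k=1.0, dt=0.001, steps=6283):
    """Integrate dx/dt = p/m, dp/dt = -k x with the leapfrog scheme.

    Returns the list of phase-space points (x(t), p(t)).
    """
    x, p = x0, p0
    orbit = [(x, p)]
    for _ in range(steps):
        p -= 0.5 * dt * k * x      # half step in momentum
        x += dt * p / m            # full step in position
        p -= 0.5 * dt * k * x      # second half step in momentum
        orbit.append((x, p))
    return orbit

def energy(x, p, m=1.0, k=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return p * p / (2 * m) + k * x * x / 2

orbit = phase_space_orbit(1.0, 0.0)
energies = [energy(x, p) for x, p in orbit]
print(orbit[-1], max(energies) - min(energies))
```

Because the energy is conserved, the points (x, p) stay on an ellipse of constant energy, and after one period (about 2π in these units) the curve closes on itself. The initial pair (x0, p0) fixes the whole orbit, which is the classical statement made above.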


We have given some examples of one-dimensional particle 
motions in the Figures II.5.1 and II.5.2, showing what they 
look like in configuration as well as phase space. So far 
the classical story of a particle. 


Wavefunctions. The story in quantum mechanics is very
different. There the state of a particle at a given time t is
described by its wavefunction $\psi(x, t)$, which is a function
that even for a single particle is defined over all of position
(configuration) space.¹ Note, however, that we do not
specify its velocity. If we just give the wavefunction over all
of space at some initial time, then the Schrödinger equation
generates the future states, given the expression
for the kinetic energy and potential energy. The Schrödinger
equation determines the time evolution of the wavefunction,
which in turn describes the particle state, and
in that sense does for a quantum particle what Newton’s
equations did for the classical particle. We encountered
this equation before in Chapter I.4 on page 158 but we will
recall some of the results here for convenience. Our intuition
about particles is deeply rooted in the Newtonian
paradigm in that we think of a particle having a definite
position and a definite velocity, and that image is of course a
long way from specifying some smooth function over all of
space. Indeed this is nothing less than a conceptual leap
that took the brightest minds a long time, first to bridge,
and later to really swallow.


¹ For readers who are not already familiar with the notion of functions
and what you can do with them I recommend looking at the Math
Excursion on page 607 of Part III.


Particle-wave duality 


In classical physics the particle and wave concepts are dis- 
tinct and mutually exclusive. In quantum theory a particle 
may manifest itself in both guises. Here the concept of 
complementarity rears its head. The concept of a quantum 
particle transcends the classical distinction and appears to 
be both. Niels Bohr applied the wave picture to atomic or- 
bits and obtained a discrete set of energy levels of which 
the lowest one is stable. A new door for fundamental phys- 
ics opened up. 


The vastly different framework of quantum mechanics we
just outlined expresses the quantessential feature known
as particle-wave duality. The wavefunction expresses the
wave nature of a particle and the Schrödinger equation is
a wave-type equation for the matter wave that represents
the particle in quantum theory. In the early days one therefore
referred to quantum mechanics as ‘wave mechanics’. That term
sounded in the classical context rather like a


Figure II.5.3: Particle probability density. A quantum probability
density V(x, t) of a particle as a function of x and t. It describes
a particle at rest, well localized around the origin at t = 0 and
then spreading out (dispersing) over space as time progresses.


contradiction in terms, because in classical physics, parti- 
cles and waves are fundamentally different concepts. Par- 
ticles are supposed to be very much localized, while for 
waves the opposite holds, they typically are spatially ex- 
tended. Particles can collide locally and exchange mo- 
mentum and energy like billiard balls, while waves ‘inter- 
act’ typically by interference where the combined waves 
show a particular pattern of maxima and minima like water 
waves in a pond. We may ask what the special proper- 
ties of a wave representing a particle are, or for that matter 
what the particle properties are of a wave, for example an 
electromagnetic wave. 


Photons. To start with the latter, it was one of Einstein’s
seminal contributions to quantum theory to postulate the
so-called photon as the quantum particle of light. Its defining
properties are that this particle moves with the velocity
of light, has zero mass, and has an energy $E = h\nu$, where $\nu$
is the frequency of the light wave.² Thus a steady electromagnetic
wave of a single frequency would correspond to
a constant flux of particles with a fixed energy or momentum.
The quantization of the energy of radiation of a given
frequency implied that the minimal amount of energy of a
wave with frequency $\nu$ had to be just $h\nu$, and this
quantization of energy was exactly what the radical postulate
of Max Planck amounted to, the postulate which started
off the whole quantum revolution. It was this assumption
which rescued the classical black body radiation law
of Rayleigh-Jeans from its demise in the high frequency
domain, as we pointed out in Chapter I.2.


²This may at first sight seem problematic perhaps, because if a
particle has mass equal to zero would then Einstein’s own dictum,
$E = mc^2$, not decree that its energy would be zero as well? Not
really, because we have to make the distinction between the rest mass
$m_0$ of a particle and its relativistic mass $m$. These are related by
$m^2 = m_0^2 + (p/c)^2$, showing that (i) if the momentum $p = 0$ indeed
$m = m_0$, and (ii) that if $m_0 = 0$ then $m = |p|/c$. This tells us that
in the latter case, where the rest mass is zero, the relativistic mass is
proportional to the momentum of the particle. Therefore, in relativity
massless particles make complete sense and the photon is the
omnipresent manifestation of that.


Matter waves. It was the French physicist De Broglie who
turned the relation around. He postulated the existence of
matter waves: for any particle type with a mass $m$, the
wavelength had to satisfy the relation $\lambda = h/p$, linking the
wavelength to the momentum. This relation is consistent
with Einstein’s formula $E = h\nu$ once you realize that for a
massless particle, according to special relativity, $E = cp$, as
we pointed out in Chapter I.2, and that for a light wave we
have $\lambda = c/\nu$.


The Bohr atom. Furthermore this picture of a matter wave
was at the heart of the atomic model of Bohr, where a definite
energy state of an electron would have a single wavelength,
but to make it periodic, it had to fit exactly on the
classical circular orbit with that energy. Imposing this
relation led to the quantization of the wavelength, and thus
of the momentum, and therefore also to the quantization
of the allowed energies for the atomic states. Bohr’s picture
of the atom predicted the discrete spectrum of energy
levels but most importantly also the existence of a lowest
energy or ground state for the atom. The ground state
corresponds to the largest wavelength that would fit on the
orbit, i.e. being equal to the circumference of that orbit.
This point is all-important,
exactly because the classical realization of an atom lacks
a true ground state: the system would be unstable and the
electron would fall into the nucleus in a short time, losing
energy by radiating. So the extremely stable atom as
we know it in nature severely violated the laws of classical
physics, and that was one of the reasons we had to
give up, not just on the naive model of the atom but on
the whole of classical physics! It was quantum theory that
provided a fundamental understanding of the stability of
matter.
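

The standing-wave argument can be turned into numbers. What follows is only a sketch under stated assumptions: a circular orbit held together by the Coulomb force (a detail left implicit above), rounded SI constants that are not taken from the text, and hypothetical function names. It fits n De Broglie wavelengths on the orbit and reads off the resulting radius and energy.

```python
import math

# Rounded SI constants (assumed values, not taken from the text)
H = 6.62607015e-34          # Planck constant, J s
HBAR = H / (2 * math.pi)
M_E = 9.1093837e-31         # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
COULOMB = E_CHARGE**2 / (4 * math.pi * EPS0)  # e^2 / (4 pi eps0), J m


def bohr_orbit(n):
    """Radius (m), momentum (kg m/s) and energy (J) of the n-th orbit.

    Combines n * lambda = 2 * pi * r with lambda = h / p, i.e. p * r = n * hbar,
    and the classical force balance p**2 / (m * r) = COULOMB / r**2.
    """
    radius = n**2 * HBAR**2 / (M_E * COULOMB)
    momentum = n * HBAR / radius
    energy = momentum**2 / (2 * M_E) - COULOMB / radius
    return radius, momentum, energy


def wavelengths_on_orbit(n):
    """Number of De Broglie wavelengths that fit on the n-th orbit."""
    radius, momentum, _ = bohr_orbit(n)
    return 2 * math.pi * radius / (H / momentum)


for n in (1, 2, 3):
    r, p, e = bohr_orbit(n)
    # print n, the radius in metres, the energy in electronvolt, and the wave count
    print(n, r, e / E_CHARGE, wavelengths_on_orbit(n))
```

The energies come out negative and scale as 1/n², so n = 1 is the lowest level: within this model there is simply no state below it for the electron to fall into.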


Where is the particle? If a particle is represented by a
wavefunction, the first question that comes to mind is: ‘but
what about the position of the particle?’ I have told you
what the momentum of the particle is, but where is it? Indeed,
where is the particle if it is a kind of standing wave
spread out around the nucleus? A perfect monochromatic
wave has in principle an infinite extent. It is a periodic
function like a sine or a cosine, but how can such a function
ever single out any particular position for the particle?
Well, you are right, it cannot.


The resolution of this tantalizing paradox has to do with 
the interpretation of the wavefunction and what it means 
to make a position measurement of a particle. We have 
touched on these matters already in Chapter II.2 where we 
learned that this comes about because of the incompati- 
bility of different observable quantities and the frameworks 
that limit the degree to which questions may or may not 
have meaningful answers. For the moment we accept the 
euphemism that Niels Bohr invented for this inconvenient 
truth of particles being waves and vice versa: he called it 
complementarity. We return to these questions explicitly 
shortly. 


The space of particle states 


We extend the symbolic mathematical representation from 
qubit to particle states. It is profitable to also think of wave- 
functions as state vectors. The square of the wavefunction 
defines a probability distribution of where to find the parti- 
cle. 


In previous chapters we looked at the space of quantum 
states of a system that classically corresponds to a system 
with a finite number of states, like an array of qubits. Now, 
we want to extend this discussion to a system of a particle 
with mass m that moves in Euclidean space. The essential 
difference is that the classical configuration space is now 
continuous. 


Hilbert space heuristics. Essentially, making the step
involves going from a discrete to a continuous space,
and that is from a mathematical point of view a subtle matter.
For that reason we will restrict ourselves here to rather
heuristic arguments. If a particle could sit only in a discrete
set of positions $x_i$ $(i = 1, \ldots, N)$, then of course the
analysis is reduced to the one we had in the previous chapters
and we would introduce a set of corresponding basis vectors
$|x_i\rangle$, which would be eigenvectors of the position
operator $X$ and hence satisfy the eigenvalue equation:

$$X|x_i\rangle = x_i|x_i\rangle , \tag{II.5.1}$$

and the state vector would be written as $|\psi\rangle = \sum_i \alpha_i |x_i\rangle$.
The natural generalization for the continuous space case is
to write the following expression for the quantum state of a
particle:

$$|\psi\rangle = \int \psi(x)\,|x\rangle\, dx . \tag{II.5.2}$$

All we know about the particle state $|\psi\rangle$ is that the state
is encoded in the corresponding complex function $\psi$ of
the continuous position variable $x$. The sum over the discrete
subscript $i$ gets replaced by an integral over the continuous
variable $x$, which is symbolically written as $\int \cdots\, dx$.
Figure II.5.4: Harmonic oscillator wavefunctions. Wavefunctions
of the three lowest energy states $\psi_n(x)$ with $n = 0, 1, 2$
of a quantum oscillator. The label $n$ also gives the number of
nodes: the even-$n$ functions are symmetric and the odd ones are
antisymmetric under $x \to -x$.


And indeed, $\psi(x)$ is just the famous wavefunction that appears
in the well-known Schrödinger equation we will get
to later. To give you an idea we have depicted the three
lowest energy states of a particle in a harmonic oscillator
potential in Figure II.5.4. These will be discussed in more
detail shortly. Talking heuristically one may say that the
wavefunction represents nothing less than a vector in an
infinite-dimensional vector space. In fact $\psi(x)$ is the ‘$|x\rangle$
component’ of the state vector $|\psi\rangle$, which suggests that we
should write it as such:

$$\psi(x) = \langle x|\psi\rangle , \tag{II.5.3}$$

leading to the expansion of the state vector in ‘position
eigenstates’,

$$|\psi\rangle = \int |x\rangle\langle x|\psi\rangle\, dx .$$

We have to make sure that we impose the normalization
condition just as we did in the discrete case; in strict analogy
it reads:

$$\langle\psi|\psi\rangle = \int \psi^*(x)\,\psi(x)\, dx = \int |\psi(x)|^2\, dx = 1 . \tag{II.5.4}$$


Figure II.5.5: Harmonic oscillator probabilities. The n = 2 
wavefunction (red) and probability density (purple). The prob- 
ability Po2 is given by the shaded area. We talk about these 
states in detail in a later section on page 395. 


So what we learn is that quantum states of particles defined
on a configuration space $\mathcal{X}$ correspond to elements
of the space $\mathcal{H}$ of (complex) functions on $\mathcal{X}$ which are
‘square integrable’, meaning that they have to satisfy the
condition (II.5.4). This space of square integrable functions
is called the Hilbert space. One can also define a scalar
product on the states that, not surprisingly, takes the
form:

$$\langle\phi|\psi\rangle = \int \phi^*(x)\,\psi(x)\, dx ,$$

completely analogous to formula (II.1.4). This once more
underscores the exceptional elegance of Dirac’s bra and
ket notation.


You could say that by going from classical to quantum de- 
scription we transcend from some space of coordinates 
to the space of functions on that space of coordinates. 
The difference with the description of the classical state is 
rather dramatic indeed, and you may wonder how to make 
sense out of it. What is the link of the wavefunction which 
is defined over all of space and the ordinary point-like par- 
ticles we observe? 
Probability interpretation. The interpretation is also completely
in line with what we expect from the discrete case:
$\psi(x)$ is the (complex) probability amplitude for the probability
density $\rho(x)$ of finding the particle at point $x$. The absolute
square of the amplitude, $\rho(x) = |\psi(x)|^2$, defines a probability
density, and hence the probability $P_{ab}$ of finding the particle
in the range $a < x < b$ can be expressed as:

$$P_{ab} = \int_a^b \rho(x)\, dx . \tag{II.5.5}$$

Formulas like the ones we have displayed in this section
may at first look a bit daunting, and you may ask what the
hell they mean. Well, stay tuned, because it is not hard
to visualize at all: the probability $P_{ab}$ is just the area under
$\rho(x)$ if you plot it as a function of $x$, between the points
$x = a$ and $x = b$. This is depicted in Figure II.5.5, and
for more details we refer to the Mathematical Excursion on
functions in Appendix A of Part III.
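

To make the ‘area under the curve’ statement concrete, here is a small numerical sketch. The Gaussian wavefunction, the dimensionless units and the function names are my own illustrative choices, not something fixed by the discussion above; the code simply adds up thin trapezoids under $|\psi(x)|^2$.

```python
import math


def psi(x):
    """Assumed example state: a normalized Gaussian in dimensionless units."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)


def probability(a, b, steps=20000):
    """Trapezoidal estimate of the integral of |psi(x)|^2 from a to b."""
    h = (b - a) / steps
    total = 0.5 * (abs(psi(a)) ** 2 + abs(psi(b)) ** 2)
    for i in range(1, steps):
        total += abs(psi(a + i * h)) ** 2
    return total * h


# Total probability over (effectively) all of space, and the part in -1 < x < 1
print(probability(-10.0, 10.0))
print(probability(-1.0, 1.0))
```

For this particular Gaussian the second number can be compared with the error function value `math.erf(1)`, which is a convenient check that the area is being computed correctly.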


As a matter of fact physicists love the bra and ket nota- 
tion, it is compact and convenient to work with and it also 
keeps the conceptual structure of expressions remarkably 
transparent. And often progress originates in designing an 
optimal symbolic representation and notation. 


This for the moment concludes our description of the space 
of quantum states that corresponds to a classical system 
with a continuous configuration space such as a particle 
moving in ordinary space. We saw that it is described by a 
complex wavefunction that may be considered as the com- 
ponents of a vector in an infinite-dimensional vector space 
of normalizable vectors which is called the Hilbert space. 
And we have mentioned that the square of the wavefunc- 
tion corresponds to a probability density for where the par- 
ticle can be found. 


There are other pressing questions that immediately come 
to mind. You may ask: where did the velocity of the particle 
go, it appears nowhere in the specification of the quantum 
state? And what about its energy? Your point is well taken 


indeed — thank you — and we will return to the question of 
how, and to what extent, a precise velocity or momentum 
or energy can be assigned to a particle in the next section. 
But before we do so, I want to discuss an explicit example
of a set of wavefunctions for a particle that lives not only 
in one dimension, but on a circle, which is a finite one- 
dimensional space without boundary. 


A particle on a circle 


In this subsection we turn to a concrete example and look
at a quantum particle that lives on a unit circle with an angular
coordinate $0 \le \varphi < 2\pi$. This may strike you as
a particularly useless theoretical problem, but one should
be careful with those judgements. A lot of applications
of physics, and in particular of quantum physics, have to do
with settings that are effectively low-dimensional. Quantum
wires are one-dimensional. A particle that is confined
to the edge of a planar disc lives on a circle. In fact the
groundbreaking Bohr model of the atom amounted exactly
to quantizing a particle on a circle, as Bohr basically quantized
the particle on classical circular orbits. Another example
is ‘quantum dots’, which are basically finite two-dimensional
domains on which particles can live.


A particle on a circle will be described by some complex
wavefunction $\psi(\varphi)$ that is normalized but also has to satisfy
a continuity or periodicity condition such that³ $\psi(\varphi) = \psi(\varphi + 2\pi)$.


³It is more precise to say that this is a condition one imposes a priori
on physical grounds. If there is some defect on the boundary one could
well imagine imposing a different, non-trivial boundary condition, for
example $\psi(\varphi + 2\pi) = e^{i\gamma}\psi(\varphi)$. A more sophisticated treatment of
the problem would be to say that we extend the set of observables to
arbitrary translations $x$ and decompose these into $x = 2\pi n + \varphi$. The
discrete translations by $2\pi n$ form an invariant subgroup $\mathbb{Z}$ of the group
of translations on the real line $\mathbb{R}$; the different boundary conditions form
representations of this $\mathbb{Z}$ group and these are labeled by the angle
$0 \le \gamma < 2\pi$.
Figure II.5.6: La Danse. Circle dance by the French painter
Henri Matisse, painted in 1910. (©Succession Henri Matisse.) 


Momentum eigenstates. The periodic solutions are of
the form

$$\langle\varphi|k\rangle = \psi_k(\varphi) = \sqrt{\frac{1}{2\pi}}\; e^{ik\varphi} . \tag{II.5.6}$$

You would expect maybe periodic functions like cosines
and sines, but as we allow complex functions it is much
more natural to write them as simple exponential functions,
and in a sense it amounts to the same thing because of
that beautiful Euler identity $e^{ik\varphi} = \cos(k\varphi) + i\sin(k\varphi)$, as
is explained in the Math Excursion about complex numbers
on page 607 of Part III. The periodicity condition leads to
the condition that $e^{2\pi i k} = 1$, which is satisfied only if $k$ is
restricted to integer values.


Figure II.5.7: Going in circles. We have depicted the $k = 5$ real
wavefunction $\psi(\varphi) = \frac{1}{\sqrt{\pi}}\cos(5\varphi)$ of a particle on a circle.


Observe that these periodic states $\psi_k(\varphi)$ have a wavelength
$\lambda = 2\pi/k$, and using the relation of De Broglie, $\lambda = h/p$,
says that in these particular periodic states the particle
carries a momentum $p_k = \hbar k$. What about the energy
of the particle? If we think of a free particle with no force on
it, the energy would just be the kinetic energy of the states,
$E_k = p_k^2/2m$, and it therefore grows proportional to $k^2$. At
this point, however, we could also assume that the particle
is a relativistic particle, in which case the expression for the
energy of the k-th mode would be

$$E_k = \sqrt{p_k^2 c^2 + m^2 c^4} ,$$

which for small momentum reduces to the previous expression
(up to the constant rest energy $mc^2$), but for large $p_k$ we would
get $E_k \simeq p_k c$, which is proportional to $k$. We can also
immediately calculate the probability distribution for the states
to equal:

$$\rho_k(\varphi) = \langle k|\varphi\rangle\langle\varphi|k\rangle = \psi_k^*(\varphi)\,\psi_k(\varphi) = \frac{1}{2\pi} .$$


This probability density for where to find the particle is con- 
stant! This tells us that whereas the momentum of the 
particle is completely fixed with zero uncertainty, the posi- 
tion of the particle is maximally uncertain, because it corre- 
sponds to a uniform distribution over all of space. In these 
states there is no preference whatsoever for any position 
or region. The conclusion is that in this momentum frame- 
work the particle logically speaking has no position. A dra- 
matic instance of the Heisenberg uncertainty principle. We 
return to this point later on, when we will show wavefunc- 
tions which to a certain extent look more like localized par- 
ticles. These wavefunctions will be particular linear super- 
positions of this set of momentum eigenstates. 
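

Both statements, that only integer k gives a single-valued wavefunction and that the resulting probability density is flat, are easy to check numerically. The following is a minimal sketch; the function names and the sampled angles are my own choices for illustration.

```python
import cmath
import math

TWO_PI = 2 * math.pi


def psi_k(k, phi):
    """Momentum eigenfunction exp(i k phi) / sqrt(2 pi) on the unit circle."""
    return cmath.exp(1j * k * phi) / math.sqrt(TWO_PI)


def is_single_valued(k, samples=12, tol=1e-9):
    """True if psi_k(phi) equals psi_k(phi + 2 pi) at all sampled angles."""
    angles = [TWO_PI * j / samples for j in range(samples)]
    return all(abs(psi_k(k, a) - psi_k(k, a + TWO_PI)) < tol for a in angles)


def density(k, phi):
    """Probability density |psi_k(phi)|^2."""
    return abs(psi_k(k, phi)) ** 2


# An integer k closes on itself after one turn, a half-integer k does not
print(is_single_valued(3), is_single_valued(0.5))
# The density does not depend on the angle
print(density(5, 0.3), density(5, 2.9), 1 / TWO_PI)
```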


Just like in the discrete case, for a general state we may
write a general expansion like:

$$|\psi\rangle = \sum_k \alpha_k |k\rangle , \tag{II.5.7}$$

where the basis states $|k\rangle$ would equal

$$|k\rangle = \int \psi_k(\varphi)\,|\varphi\rangle\, d\varphi = \frac{1}{\sqrt{2\pi}} \int e^{ik\varphi}\,|\varphi\rangle\, d\varphi .$$

These states form an orthonormal basis, meaning that they
satisfy the orthonormality condition⁴

$$\langle k|k'\rangle = \int \langle k|\varphi\rangle\langle\varphi|k'\rangle\, d\varphi = \delta_{kk'} .$$

Now the sum in the expression (II.5.7) extends over all integer
values of $k$, indeed confirming our expectation that
the quantum state of a particle on a circle is like a vector
in an infinite-dimensional space. There is also a completeness
relation for the basis, in analogy with equation (II.2.10),
which reads

$$\sum_k |k\rangle\langle k| = 1 .$$


Figure II.5.8: Going in circles. The probability distribution
$\rho(\varphi) = \frac{1}{\pi}\cos^2(5\varphi)$ of a particle on a circle.


⁴To prove it you need the functional relation that
$\frac{1}{2\pi}\int_0^{2\pi} \exp\big(i(k - k')\varphi\big)\, d\varphi = \delta_{kk'}$.
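

The orthonormality integral can also be checked by brute force. The sketch below replaces the integral over the circle by a sum over equally spaced angles; the number of sample points and the function name are assumptions of mine, and the sum is only reliable as long as |k − k'| stays below the number of points.

```python
import cmath
import math


def overlap(k, k_prime, points=256):
    """Riemann-sum estimate of <k|k'>.

    The integrand is psi_k^*(phi) psi_k'(phi) = exp(i (k' - k) phi) / (2 pi).
    """
    step = 2 * math.pi / points
    total = sum(cmath.exp(1j * (k_prime - k) * j * step) for j in range(points))
    return total * step / (2 * math.pi)


# Same label gives 1, different labels give 0 (up to rounding)
print(abs(overlap(4, 4)), abs(overlap(4, 7)), abs(overlap(-3, 3)))
```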


Position and momentum operators 


In the earlier chapters we have been talking about quantum
dynamical variables as operators or matrices. For
example, for the qubit we showed in Chapter II.1 that we
could interpret the Pauli Z-matrix as the position operator,
and the Pauli X-matrix as the momentum operator. So the
first question that comes up, if we think of particle states
as wavefunctions, is what the operator-valued observables
should look like. Something like infinite-dimensional matrices
maybe? The answer is simpler than that and quite
natural if you think of operators that have to act on functions.
You can multiply functions by other functions, but
more importantly we can differentiate functions. We should
expect dynamical variables to be represented by differential
operators. So let us first consider the momentum operator.


Momentum. In this section we look at the definition of
momentum and position operators for a particle on a circle.
The state vectors are the wavefunctions $\psi_k(\varphi)$ given
in equation (II.5.6). We will show that the momentum corresponds
to the differential operator

$$P = -i\hbar\,\frac{d}{d\varphi} .$$

First observe that the functions $\psi_k(\varphi)$ are eigenfunctions
of $P$, because

$$P\,\psi_k(\varphi) = \hbar k\,\psi_k(\varphi) ,$$

and recall that we argued in the previous subsection, based
on De Broglie’s heuristic argument, that the momentum of a
particle in the k-th state is indeed equal to $p_k = \hbar k$.
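

If you prefer to see the eigenvalue equation at work rather than take the derivative by hand, the following sketch applies the operator −iħ d/dφ numerically. It assumes units in which ħ = 1 and approximates the derivative by a central difference; the step size and the names are illustrative choices, not part of the argument.

```python
import cmath
import math

HBAR = 1.0  # assumed units in which hbar = 1


def psi_k(k, phi):
    """Momentum eigenfunction exp(i k phi) / sqrt(2 pi)."""
    return cmath.exp(1j * k * phi) / math.sqrt(2 * math.pi)


def apply_momentum(f, phi, h=1e-5):
    """Approximate (-i hbar d/dphi) f at the angle phi by a central difference."""
    return -1j * HBAR * (f(phi + h) - f(phi - h)) / (2 * h)


k, phi = 3, 0.8
lhs = apply_momentum(lambda x: psi_k(k, x), phi)   # P acting on psi_k
rhs = HBAR * k * psi_k(k, phi)                     # hbar * k times psi_k
print(abs(lhs - rhs))  # small: only the finite-difference error remains
```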


Generator of translations. At this point you might want to
look at the Math Excursion on page 607 of Part III, where it
is shown that the displacement of the state vector or wavefunction
is also generated by the derivative or differential
operator⁵ $K = -i\,\frac{d}{d\varphi}$. We can formally put it in the exponent,
just like the sigma matrices before, having the property:

$$e^{i\delta K}\,\psi(\varphi) = e^{\delta\frac{d}{d\varphi}}\,\psi(\varphi) = \psi(\varphi + \delta) . \tag{II.5.8}$$

We see that this exponential operator just shifts the argument
of the function by the parameter $\delta$ that multiplies $K$ in the
exponent. This equation is the precise mathematical expression
of the statement that the momentum operator (in fact
$P/\hbar$ to be precise) acting on a function ‘generates’ spatial
translations of its coordinate (position).
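

The claim that an exponentiated derivative shifts a function can be tested directly. In the sketch below the function is a finite sum of modes exp(ikφ), stored as a dictionary of coefficients (a representation I chose for convenience), and the exponential of the derivative is summed as a truncated power series. The number of terms and all names are assumptions.

```python
import cmath


def f(coeffs, phi):
    """Trigonometric polynomial: sum over k of c_k exp(i k phi)."""
    return sum(c * cmath.exp(1j * k * phi) for k, c in coeffs.items())


def shifted_by_series(coeffs, delta, phi, terms=60):
    """Apply exp(delta d/dphi) to f as a truncated power series.

    On the mode exp(i k phi) every derivative produces a factor i k, so the
    n-th term of the series multiplies that mode by (i k delta)**n / n!.
    """
    total = 0j
    for k, c in coeffs.items():
        factor, term = 0j, 1 + 0j
        for n in range(terms):
            factor += term
            term *= 1j * k * delta / (n + 1)
        total += c * factor * cmath.exp(1j * k * phi)
    return total


coeffs = {-2: 0.3 - 0.1j, 0: 1.0, 1: 0.5j, 3: -0.2}
phi, delta = 1.1, 0.7
# The series result should coincide with the function evaluated at phi + delta
print(abs(shifted_by_series(coeffs, delta, phi) - f(coeffs, phi + delta)))
```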


The position operator. What about the position operator
$\Phi$? It acts on the wavefunction as $\Phi\,\psi(\varphi) = \varphi\,\psi(\varphi)$,
i.e. by just multiplying the wavefunction by the variable ‘$\varphi$’.
Note, however, that the $\psi_k(\varphi)$ are eigenfunctions of momentum
$P$ but not of position $\Phi$ (because $k$ is a constant
and $\varphi$ is not; it is a coordinate, a variable). So the position
operator ‘multiplies’ the wavefunction with the function
$\varphi$.

It may be useful to again point out the analogy with the
qubit case in Chapter II.2, where the would-be position operator
was Z and a would-be momentum operator could
be X. We could then consider the states $|\pm\rangle$ defined in
equation (II.2.4), which are eigenstates not of Z but of X,
because $X|\pm\rangle = \pm|\pm\rangle$. And indeed, acting with Z on,
for example, the X eigenvector $|+\rangle$ would multiply each
component with a different coordinate value, leading to
$Z|+\rangle = |-\rangle$. So, acting with the coordinate operator on a
momentum eigenstate changes it to another state.


⁵The difference between $P$ and $K$ is a matter of units or dimensions:
the dimension of the differential operator is [1/length], so to get
the dimensions of momentum we have to multiply by a constant with
dimension [length × momentum] = [joule × second], and yes, not
surprisingly, that constant is nothing but Planck’s constant $\hbar$.


Canonical commutation relations. Being eigenfunctions
of momentum, the $\psi_k(\varphi)$ are also eigenfunctions of a Hamiltonian
$H = P^2/2m$ describing a free particle of mass
$m$ that moves on a circle. We also see that, as we might
expect, the momentum and position operators do not commute;
they satisfy the so-called canonical commutation relations:

$$[X, P] = i\hbar . \tag{II.5.9}$$

To see that this is true it is most convenient to think of the
commutator as an operator working on a (wave) function
$f(x)$; then we obtain:

$$[X, P]\,f(x) = -x\,i\hbar\,\frac{d}{dx}f(x) + i\hbar\,\frac{d}{dx}\big(x f(x)\big) = i\hbar\,f(x) .$$

As the function appearing on both sides of the equation
is arbitrary, we may conclude that the statement (II.5.9) is
true as a property of the operators.


Raising and lowering. Let us ask for the raising and lowering
operators of this problem. Let us first try to find operators
$Q_\pm$ that satisfy the commutation relations:

$$[P, Q_\pm(X)] = \pm a\,Q_\pm(X) , \tag{II.5.10}$$

and as $[P, Q(X)] = -i\hbar\,dQ(X)/dX$
we obtain an equation for the functions $Q_\pm(x)$:

$$-i\hbar\,\frac{dQ_\pm(x)}{dx} = \pm a\,Q_\pm(x) .$$

The solutions to this equation are $Q_\pm(x) = c\,\exp(\pm iax/\hbar)$,
and therefore one obtains for the operators:

$$Q_\pm(X) = c\,e^{\pm iaX/\hbar} . \tag{II.5.11}$$

The interpretation is now as follows. The momentum of
a particle on the circle has a discrete spectrum $\{\hbar k\}$ with
integers $-\infty < k < \infty$, for clockwise and counterclockwise
moving particles. The state with the smallest possible momentum
(in magnitude) has $k = 0$, and the raising and lowering operators (II.5.11)
sequentially generate all the eigenfunctions $\psi_k(x)$ if we
choose $a = \hbar$ (that is, $a = 1$ in units where $\hbar = 1$). We clearly
have to adjust the value of $a$
to comply with the imposed boundary condition.
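

In the basis of momentum modes the ladder structure becomes almost trivial bookkeeping, which the following sketch tries to show. A state is stored as a dictionary mapping the integer k to its coefficient, the momentum operator multiplies each coefficient by ħk, and the raising operator with a = ħ just relabels k as k + 1. The representation, the choice ħ = 1 and the names are assumptions made for illustration.

```python
HBAR = 1.0  # assumed units in which hbar = 1


def momentum(state):
    """P acts diagonally on the modes: c_k -> hbar * k * c_k."""
    return {k: HBAR * k * c for k, c in state.items()}


def raise_k(state):
    """Q_+ = exp(i phi) maps the mode k to the mode k + 1."""
    return {k + 1: c for k, c in state.items()}


def lower_k(state):
    """Q_- = exp(-i phi) maps the mode k to the mode k - 1."""
    return {k - 1: c for k, c in state.items()}


def subtract(a, b):
    """Componentwise difference of two states."""
    return {k: a.get(k, 0) - b.get(k, 0) for k in set(a) | set(b)}


state = {-1: 0.5, 0: 0.5j, 2: 0.7}
lhs = subtract(momentum(raise_k(state)), raise_k(momentum(state)))  # [P, Q_+] state
rhs = {k: HBAR * c for k, c in raise_k(state).items()}              # hbar * Q_+ state
print(all(abs(lhs[k] - rhs[k]) < 1e-12 for k in rhs))
```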


Heisenberg’s uncertainty. It is amusing to check the Heisenberg
uncertainty relation by verifying that indeed $\Delta x = L$
and $\Delta p = p_0 = \hbar\pi/L$ satisfy:

$$\Delta x\,\Delta p = \hbar\pi \;\ge\; \tfrac{1}{2}\,\big|\langle\psi_0|[X, P]|\psi_0\rangle\big| = \tfrac{\hbar}{2}\,\langle\psi_0|\psi_0\rangle = \tfrac{\hbar}{2} .$$

We see that these states do not saturate the lower bound
on the uncertainty relation. That lower bound is $\hbar/2$, as
we showed in Chapter II.2 on page 317.


Energy generates time evolution 


Time evolution. If we talk about time evolution of a classi- 
cal system, we think in the first place of Newton, but in the 
realm of computation we also think of the physical imple- 
mentation of a sequence of logical gates. A computation 
is in that sense a discrete dynamical process whose rate 
is set by the speed or clock time of the chip, today being of 
the order of nanoseconds. We process information by ma- 
nipulating it through interacting with it in a controlled way 
by having logical gates acting. That is similar to applying 
a force to get ourselves moving as we saw in the previous 
chapters. Now even in the heyday of classical mechanics 
many different approaches were formulated in attempts to 
solve specific dynamical problems by people like Hamil- 
ton, Jacobi, Laplace, Lagrange, Legendre and others. We 
discussed some of them in Chapter I.1 on page 16. 


In Figure II.5.9 we have indicated various paths that lead
from the domain of classical mechanics to the corresponding
quantum equations. I am going to discuss them sequentially,
and start with the Schrödinger equation.


Wave mechanics: the Schrödinger equation


The wavefunction of a quantum system evolves in time according
to the famous Schrödinger equation. Dynamical
changes in a physical system are induced by the underlying
forces acting on the system and between its constituent
parts, and their effect can be represented in terms of what
is called the energy or Hamiltonian operator $H$. For a single
qubit system the operators can be represented as $2 \times 2$
matrices, for a two-qubit system they are $4 \times 4$ matrices,
etc. The Schrödinger equation can be written

$$i\hbar\,\frac{d|\psi(t)\rangle}{dt} = H\,|\psi(t)\rangle . \tag{II.5.12}$$

This is a linear differential equation expressing the property
that the time evolution of a quantum system is generated
by its energy operator. Assuming that $H$ is constant,
given an initial state $|\psi(0)\rangle$ the solution is simply

$$|\psi(t)\rangle = U(t)\,|\psi(0)\rangle \quad\text{with}\quad U(t) = e^{-iHt/\hbar} . \tag{II.5.13}$$

The time evolution is unitary, meaning that the operator
$U(t)$ satisfies $UU^\dagger = 1$:

$$U^\dagger = \exp(-iHt/\hbar)^\dagger = \exp(iH^\dagger t/\hbar) = \exp(iHt/\hbar) = U^{-1} . \tag{II.5.14}$$

Unitary time evolution means that the length of the state
vector remains invariant, which is necessary to preserve
the total probability for the system to be in any of its possible
states. The unitary nature of the time evolution operator
$U$ follows directly from the fact that $H$ is hermitian:
$H^\dagger = H$. Any hermitian $2 \times 2$ matrix can be written

$$A = \begin{pmatrix} a & b + ic \\ b - ic & -a \end{pmatrix} ,$$

where $a$, $b$ and $c$ are real numbers.⁶


⁶We omitted a component proportional to the unit matrix as it acts
trivially on any state. We speak of the part that has no trace.
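

For the traceless hermitian 2 × 2 matrix just written down, unitarity can be verified explicitly. The sketch below is one possible way to do it, under a few assumptions of mine: units with ħ = 1, at least one of a, b, c non-zero, and the use of the identity H² = (a² + b² + c²)·1, which turns the exponential into a cosine and a sine term. The helper names are hypothetical.

```python
import math

HBAR = 1.0  # assumed units in which hbar = 1


def evolution_operator(a, b, c, t):
    """U(t) = exp(-i H t / hbar) for H = [[a, b + ic], [b - ic, -a]].

    Because H @ H = w**2 * identity with w = sqrt(a^2 + b^2 + c^2) (w != 0),
    U = cos(w t / hbar) * identity - i * sin(w t / hbar) * H / w.
    """
    w = math.sqrt(a * a + b * b + c * c)
    h = [[a, b + 1j * c], [b - 1j * c, -a]]
    cos_, sin_ = math.cos(w * t / HBAR), math.sin(w * t / HBAR)
    return [[cos_ * (i == j) - 1j * sin_ * h[i][j] / w for j in range(2)]
            for i in range(2)]


def dagger(m):
    """Hermitian conjugate (transpose plus complex conjugation)."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


u = evolution_operator(0.3, -1.2, 0.5, 2.0)
product = matmul(u, dagger(u))  # should be the unit matrix
print(product)
```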
Stationary states. From equation (II.5.13) it is also
immediately clear what the importance is of the eigenstates
of the Hamiltonian, the energy eigenstates. An energy
eigenstate $|\psi_n\rangle$ satisfies by definition $H|\psi_n\rangle = E_n|\psi_n\rangle$,
and thus for such states:

$$|\psi_n(t)\rangle = \exp(-iE_n t/\hbar)\,|\psi_n(0)\rangle . \tag{II.5.15}$$

The state is not quite time-independent, but it changes
only by an overall phase factor, which means that the probability
density $|\psi|^2$, or the expectation value of any operator,
will not change over time. The state is strictly speaking not
static and is therefore called stationary.


Time dependence. But if we act on a state that is not an
eigenstate of the energy, we get a time dependent solution.
For the simple example of a single qubit, suppose the initial
state is

$$|\psi(0)\rangle = |+\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} .$$

On the right, for the sake of convenience, we have written
the state as a column vector. Consider the energy of a spin
in an external magnetic field $B$ directed along the positive
z-axis.⁷ In this case $H$ is given by $H = bZ$. Now the
initial state is a linear combination of two different energy
eigenstates. From equation (II.5.13) it follows that

$$U(t) = \exp\!\left(-\frac{ibt}{\hbar}\,Z\right) = \begin{pmatrix} \exp(-ibt/\hbar) & 0 \\ 0 & \exp(ibt/\hbar) \end{pmatrix} . \tag{II.5.16}$$

We obtain an oscillatory time dependence for the state, not
just a phase factor, i.e.

$$|\psi(t)\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{-ibt/\hbar} \\ e^{ibt/\hbar} \end{pmatrix} = \cos\frac{bt}{\hbar}\,|+\rangle - i\sin\frac{bt}{\hbar}\,|-\rangle . \tag{II.5.17}$$

Figure II.5.9: Ways to go quantum. Various pathways from different
but equivalent formulations of classical mechanics to the
Schrödinger and Heisenberg (also equivalent) formulations
of quantum theory. [Diagram labels recoverable from the figure:
Poisson brackets; Heisenberg equation $i\hbar\,dQ/dt = [Q, H]$;
Schrödinger equation $i\hbar\,d\Psi/dt = H\Psi$.]


The state oscillates between $|+\rangle$, with probability
$p_+ = |\langle +|\psi(t)\rangle|^2 = \cos^2(bt/\hbar)$, and $|-\rangle$, with
$p_- = |\langle -|\psi(t)\rangle|^2 = \sin^2(bt/\hbar)$. This simple example applies in some form or
another to numerous physically relevant two-level systems.
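

The oscillation is easily reproduced in a few lines. The sketch below assumes, as the probabilities just quoted suggest, that the spin starts in the X eigenstate |+⟩ and that H = bZ, with ħ = 1; the numerical values and function names are placeholders for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumed units in which hbar = 1


def evolve_plus_state(b, t):
    """|psi(t)> = U(t)|+> for H = b Z, with the diagonal U(t) of (II.5.16)."""
    theta = b * t / HBAR
    return [cmath.exp(-1j * theta) / math.sqrt(2),
            cmath.exp(1j * theta) / math.sqrt(2)]


def probabilities(state):
    """Probabilities of finding the X eigenstates |+> and |->."""
    amp_plus = (state[0] + state[1]) / math.sqrt(2)
    amp_minus = (state[0] - state[1]) / math.sqrt(2)
    return abs(amp_plus) ** 2, abs(amp_minus) ** 2


b, t = 0.7, 1.3
p_plus, p_minus = probabilities(evolve_plus_state(b, t))
# Compare with the closed forms cos^2(bt/hbar) and sin^2(bt/hbar)
print(p_plus, math.cos(b * t / HBAR) ** 2)
print(p_minus, math.sin(b * t / HBAR) ** 2)
```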


We see that, in contrast to classical mechanics, the time 
evolution equation is first order in time and linear in the 
wavefunction. In general the Hamiltonian can be a compli- 
cated function of the basic dynamical variables and there- 
fore it is only in rare situations that one can find an exact 
analytic solution. On the other hand it is also surprising to 
see how a relatively small number of exactly solved prob- 
lems can serve to get a deep insight in, and feeling for, 
what kind of behavior quantum systems exhibit. 


⁷Quantum spins necessarily have a magnetic moment, so in addition
to carrying an intrinsic angular momentum they also interact with a
magnetic field.
Matrix mechanics: the Heisenberg equation


In the previous section we have considered the time evolution
of the state generated by some particular Hamiltonian,
which we assumed to be time independent. In that
Schrödinger-type description the operators one considers
are mostly time independent, and the time-dependent state
$|\psi(t)\rangle$ is a solution to the Schrödinger equation with the
given Hamiltonian, which characterizes the system and its
interactions. There is a complementary view which was
developed by Heisenberg; it is often called ‘matrix mechanics’
and it led to the Heisenberg equation. In his
view, which in a sense is closer to classical dynamics, the
dynamical variables, meaning observables represented by matrices,
are the objects that change in time, whereas the state remains
fixed. The simplest way to see how this comes
about is to rewrite the definition of the expectation value
of an operator $A$ in a suggestive way as:

$$\langle\psi(t)|A|\psi(t)\rangle = \langle\psi(0)|\,e^{iHt/\hbar}\,A\,e^{-iHt/\hbar}\,|\psi(0)\rangle = \langle\psi(0)|A(t)|\psi(0)\rangle . \tag{II.5.18}$$


In other words we have defined time-dependent observables
for the system through the relation:

$$A(t) = e^{iHt/\hbar}\,A\,e^{-iHt/\hbar} .$$

By calculating the time derivative of the above expression
one arrives at Heisenberg’s quantum equation of motion:

$$-i\hbar\,\frac{dA(t)}{dt} = [H, A(t)] , \tag{II.5.19}$$

and we have an equation that tells us that the time evolution
of operators acting on the state space is generated
by the commutator with the Hamiltonian of the system.
Note that the commutation relations of observables are unchanged
by the transformation, so we still have the canonical
commutator $[X, P] = i\hbar$. I would also like to remind the
readers who happened to read my discussion on Poisson
brackets on page 16 of Part I that there is indeed a striking
similarity between the classical Poisson brackets and
the Heisenberg commutator equations. The recipe is to
make in equations (I.1.14) to (I.1.16) the following replacement

$$\{\cdot\,,\cdot\}_{\mathrm{PB}} \;\to\; \frac{1}{i\hbar}\,[\cdot\,,\cdot]$$

to obtain the canonical quantum equations! This was by
the way the method Dirac used to ‘quantize’ systems.
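

As a sanity check of the Heisenberg equation, written here in the form −iħ dA/dt = [H, A(t)], one can take the qubit example with H = bZ and follow the operator X in time. The sketch below does this numerically; the value of b, the choice ħ = 1, the finite-difference step and all helper names are assumptions made for illustration.

```python
import cmath

HBAR = 1.0     # assumed units in which hbar = 1
B_FIELD = 0.7  # assumed value of the coupling b in H = b Z


def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def heisenberg_x(t):
    """X(t) = U(t)^dagger X U(t) with U(t) = exp(-i b Z t / hbar)."""
    theta = B_FIELD * t / HBAR
    u = [[cmath.exp(-1j * theta), 0], [0, cmath.exp(1j * theta)]]
    u_dag = [[u[0][0].conjugate(), 0], [0, u[1][1].conjugate()]]
    pauli_x = [[0, 1], [1, 0]]
    return matmul(matmul(u_dag, pauli_x), u)


def time_derivative(t, h=1e-6):
    """Central-difference estimate of dX(t)/dt."""
    plus, minus = heisenberg_x(t + h), heisenberg_x(t - h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]


def commutator_with_h(a):
    """[H, a] for H = b Z."""
    hmat = [[B_FIELD, 0], [0, -B_FIELD]]
    ha, ah = matmul(hmat, a), matmul(a, hmat)
    return [[ha[i][j] - ah[i][j] for j in range(2)] for i in range(2)]


t0 = 0.9
lhs = [[-1j * HBAR * d for d in row] for row in time_derivative(t0)]
rhs = commutator_with_h(heisenberg_x(t0))
residual = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(residual)  # small: only the finite-difference error remains
```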


It is illuminating to keep both formulations in mind. Certain 
questions can be answered more easily in the Schrödinger 
picture and others in the Heisenberg picture. 


Symmetries and conservation laws. The Heisenberg 
equation yields a direct understanding of the existence of 
‘constants of the motion’ or conservation laws. For physi- 
cal variables described by operators Q that commute with 
the Hamiltonian i.e. [H, Q] = 0, the Heisenberg equation 
teaches us that dQ/dt = 0 and thus that Q is conserved 
in time. Such operators that commute with the Hamilton- 
ian are by definition called symmetry operators. You see 
that energy is one of them, and that had better be so, be- 
cause time independence of the Hamiltonian was after all 
our starting point. Depending on the system and its Hamil- 
tonian we will find out about the conserved quantities this 
way, like momentum, angular momentum, the Lenz vector, 
charge, isospin etc. Indeed summing up these examples 
one realizes how important these basic conservation laws 
are, as they allow us to characterize the states by prop- 
erties that are robust in time, and that allows us to label 
and assign names to things! After all, your name would be 
useless if it were to change every day. 


Degeneracies. The other consequence of having con- 
served quantities is that if Q acts on an eigenstate of the 
Hamiltonian then it may well make another state, but that 
state will have the same energy as the first one. You can 
use the conserved quantities or symmetry operators to 
generate ‘degenerate states’ in the spectrum. The state- 
ment is stronger than that, because you can always find 
enough symmetry operators to resolve all degeneracies 
and label the different orthogonal states that are degenerate
in energy by labels referring to conserved quantum
numbers. We saw this principle already at work in Chapter I.4
where we discussed the discovery of the electron
spin. This was achieved by lifting the degeneracy by in- 
troducing an external magnetic field, which broke the rota- 
tional symmetry of the system. 


A framework of symmetry operators. So in choosing a 
framework, we often like to include the Hamiltonian as one 
of the operators. The next thing you may want to do is to 
add operators that commute with H, which in other words 
correspond to conserved quantities. These operators form 
a closed algebra in the sense that if $Q_1$ and $Q_2$ commute
with H, then also their commutator $[Q_1, Q_2]$ will commute
with H. This way we can construct a commutator or Lie- 
algebra of symmetry operators including the Hamiltonian. 
This algebra is then called the symmetry algebra for the 
system. 
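The closure property used here follows in one line from the Jacobi identity for commutators. Spelled out (a standard identity, added as a worked step rather than quoted from the text):

```latex
[H,[Q_1,Q_2]] = -\,[Q_1,[Q_2,H]] - [Q_2,[H,Q_1]] = 0
\qquad \text{whenever} \qquad [H,Q_1] = [H,Q_2] = 0 .
```

So the commutator of two conserved quantities is again conserved, which is what makes the set of symmetry operators a Lie algebra.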


Next we follow the instructions for a consistent framework 
and select, out of all those conserved quantities, a max- 
imal number which do mutually commute. That defines 
a sub-algebra of the full symmetry algebra, consisting of 
observables whose combined eigenvalues form the sam- 
ple space, for that framework. In fact such a maximal set 
of mutually commuting independent symmetry operators 
is called the Cartan subalgebra of the symmetry algebra. 
This algebra is named after the famous French mathematician
Élie Cartan, who succeeded in completely classifying
all possible finite-dimensional (complex) simple Lie algebras.
Many of those play a crucial role in quantum physics. 


The next chapter is devoted to different kinds of symmetry 
and their breaking, and it will become clear that the no- 
tion of symmetry is one of the guiding principles that has 
played a leading role in the development of modern phys- 
ics. 


Generators of symmetries. So we have arrived at a rather 
quantessential picture linked to (continuous) symmetries. 
The operators Q that are conserved generate the symme- 
tries, and they can therefore be used to label the states, 
and furthermore they are physical observables. If I say that
the Heisenberg equation just tells you that the Hamiltonian
generates a time translation, what I mean is that an infinitesimal
change of an observable A in time, $-i\hbar\, dA/dt$,
equals the commutator with the Hamiltonian. One can also
write for example:

$$-i\hbar\,\frac{dA}{dx} = [P, A] , \qquad i\hbar\,\frac{dA}{dp} = [X, A] ,$$


which states that the operator dependence on position or
momentum is generated by their 'duals', the momentum
and position operators respectively. Similarly the commutator
with the angular momentum component $L_z$ generates
an infinitesimal rotation around the z-axis of the operators.
We see that the Heisenberg equation is in fact
one of many. It is the equation that expresses the time- 
translation-symmetry of the underlying space-time, from 
which energy conservation is derived. 
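As a quick check of the statement that momentum and position generate each other's dependence, take the illustrative choices $A = X^2$ and $A = P^2$ (assumptions made here for the example). Using only $[X, P] = i\hbar$:

```latex
[P, X^2] = [P,X]\,X + X\,[P,X] = -2i\hbar\,X = -i\hbar\,\frac{d(X^2)}{dx} ,
\qquad
[X, P^2] = [X,P]\,P + P\,[X,P] = 2i\hbar\,P = i\hbar\,\frac{d(P^2)}{dp} .
```

In both cases the commutator with the conjugate variable acts as a derivative, up to the factor $\mp i\hbar$.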


Classical lookalikes 


In our discussion of (free) particle states we have clearly
found two extremes: (i) the momentum states $|k\rangle$, where
the momentum and energy have no uncertainty, but the
uncertainty in position is maximal and (ii) the position states
$|\varphi\rangle$, where the converse would hold. Neither of these
seems close to what we think about when we talk about a
classical particle moving on a circle. We know that we are
free to consider any state of the type given in (II.5.2) and
therefore we can ask whether it is possible to find
particular linear combinations of basic quantum states that look
more like the classical ones.
Figure II.5.10: Wave packets. We make a 'gaussian' superposition
of plane waves with momentum k given by $\psi(k)$ (the blue
curve). Then the wavefunction $\psi(x)$ in x-space (the red curve) is as
depicted at the bottom, and the enveloping curve of the wavy
pattern is again a gaussian.


Wave packets. This is certainly possible, as Schrödinger
actually already pointed out. He studied what are called
'wave packets.' These are smooth linear combinations of,
say, momentum eigenstates, which are localized in both
position and momentum space. These packets have an
average momentum $k_0$ and an average position, and look
in many respects like extended particle-like objects.


The starting point is simple, namely to look for states where
the uncertainty in canonically conjugate (incompatible) variables
is minimized and balanced, respecting the Heisenberg
uncertainty relations. Because the Schrödinger
equation is linear we may consider arbitrary linear combinations
of the momentum states, and these are then time-dependent
solutions because the different momentum components have
different energies.


Such a wave packet can be defined by specifying a function
$\psi(k)$ and looking at the state

$$|\psi\rangle = \int \psi(k)\,|k\rangle\,dk .$$

As the formula suggests the function is just the 'wavefunction
in momentum space', as we may write:

$$\psi(k) = \langle k|\psi\rangle .$$


Now let us take a smooth gaussian (normal distribution)
centered around some momentum $k_0$,⁸

$$\psi(k) = \left(\frac{2\alpha}{\pi}\right)^{1/4} e^{-\alpha (k-k_0)^2} .$$

The factor in front makes sure that the state is properly
normalized, so that all probabilities add up to one. We
have displayed this function in Figure II.5.10 and indeed it
is nicely peaked with a certain width around $k_0$.


Now we want to see what this package deal means for
people who live in ordinary x or $\varphi$ space. Using equation
(II.5.6) we calculate:

$$\psi(\varphi) = \langle\varphi|\psi\rangle = \int \langle\varphi|k\rangle\langle k|\psi\rangle\,dk = \left(\frac{1}{2\pi\alpha}\right)^{1/4} e^{ik_0\varphi}\, e^{-\varphi^2/4\alpha} .$$

What do we see? First of all we see that the wavefunction
of the state is also gaussian in $\varphi$ space! That is nice
because it does indeed mean that the packet is also well
localized in position space, just as we wanted it. What
we also see is that the width of that distribution is like the
inverse of the width in momentum space. To be precise,
for the probability distributions $|\psi(k)|^2$ and $|\psi(\varphi)|^2$
we have $\Delta k = 1/(2\sqrt{\alpha})$ and $\Delta\varphi = \sqrt{\alpha}$, which shows
that the packet is optimal in the sense that it saturates the
lower bound on the uncertainties imposed by Heisenberg:
$\Delta k\,\Delta\varphi = 1/2$. Finally we see that the wavefunction in
position space also has a factor $e^{ik_0\varphi}$, which makes the

⁸We discussed the gaussian or normal distribution in the Math Excursion
on probability and statistics at the end of Chapter I.1. There it
was also explained why this distribution pops up everywhere.
Figure II.5.11: Wave packet dispersion. The time evolution of
the free wave packet according to the Schrödinger equation has
two generic features: (i) it moves forward with the group velocity
of the packet, which is the effective velocity of the 'particle', and
(ii) the packet will broaden (disperse) over time. (The panels
show $\psi(x, t_1)$ and $\psi(x, t_2)$.)


function periodic (and complex). The red curve depicted in 
the figure is (the real part of) the wave packet in coordinate 
space. 
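The saturation of the Heisenberg bound can be verified numerically. The sketch below assumes a gaussian momentum profile proportional to exp(-α(k - k₀)²) and its Fourier transform proportional to exp(-φ²/4α), treats φ as a coordinate on the line (reasonable for a narrow packet), and uses arbitrary illustrative values for α and k₀. It computes the standard deviations of the two probability distributions by a simple Riemann sum.

```python
import math

alpha, k0 = 0.7, 5.0   # illustrative packet parameters (assumptions)


def psi_k_sq(k):
    """|psi(k)|^2 for psi(k) = (2 alpha/pi)^(1/4) exp(-alpha (k - k0)^2)."""
    return math.sqrt(2 * alpha / math.pi) * math.exp(-2 * alpha * (k - k0) ** 2)


def psi_phi_sq(phi):
    """|psi(phi)|^2 of the transformed packet; the phase exp(i k0 phi) drops out."""
    return math.exp(-phi ** 2 / (2 * alpha)) / math.sqrt(2 * math.pi * alpha)


def spread(density, center, half_width=12.0, n=4801):
    """Standard deviation of a probability density, via a Riemann sum."""
    step = 2 * half_width / (n - 1)
    grid = [center - half_width + i * step for i in range(n)]
    norm = sum(density(u) for u in grid) * step
    mean = sum(u * density(u) for u in grid) * step / norm
    var = sum((u - mean) ** 2 * density(u) for u in grid) * step / norm
    return math.sqrt(var)


delta_k = spread(psi_k_sq, k0)
delta_phi = spread(psi_phi_sq, 0.0)
print(delta_k, delta_phi, delta_k * delta_phi)
```

Changing α trades width in momentum for width in position, but the product of the two spreads stays at 1/2 (in units with ħ = 1).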


Propagation and dispersion. Let us assume that the
configuration at t = 0 is the one we just described, then
it is interesting to see what happens in time, exactly because
the packet is made up of momentum components
that propagate with different speeds. The question is therefore
what that means for the time evolution of the packet as
a whole. We don't want to go through the calculation here,
but the important message is sketched in Figure II.5.11.
The first point to mention is that the center of the packet,
or the envelope of the wavy pattern, moves with the so-called
group velocity. This is a velocity which is different
from the phase velocity of the individual momentum
components. One typically sees the effect that the wavy
pattern moves faster than the envelope and one sees the
small wave appear on the left (increasing in amplitude) and
disappear at the right (decreasing in amplitude) of the envelope.
The second point to mention is that the packet
broadens in time: it disperses. If one calculates the probability
distribution $p(\varphi)$ the periodicity drops out and we
get a pure gaussian that is broadening, as we displayed
already in Figure II.5.3 for a particle at rest. This dispersion
worried, among others, Schrödinger himself quite a bit,
because it basically blocked a direct interpretation of the
wave packet as a particle, which seemed to disintegrate
on quite short time scales. It was this aspect
that was resolved by the probabilistic interpretation (the so-called
Kopenhagener Deutung) proposed by Born.
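Both features, the motion of the center and the broadening, can be reproduced with a few lines of numerics. This is a rough sketch under stated assumptions: a free particle on the line with ħ = m = 1 (so each plane wave carries the phase exp(-i k² t/2)), a gaussian momentum profile exp(-α(k - k₀)²), and hand-picked grids; none of these numbers come from the text. For this setup the center should move with group velocity k₀ and the variance should grow as α + t²/(4α).

```python
import cmath
import math

alpha, k0 = 1.0, 5.0     # packet parameters (illustrative assumptions)
DK, NK = 0.075, 161      # momentum grid: k0 - 6 ... k0 + 6
DX, NX = 0.125, 321      # position grid: -15 ... 25

ks = [k0 - 6.0 + i * DK for i in range(NK)]
xs = [-15.0 + i * DX for i in range(NX)]
amps = [(2 * alpha / math.pi) ** 0.25 * math.exp(-alpha * (k - k0) ** 2)
        for k in ks]


def density(t):
    """|psi(x,t)|^2 on the x grid; plane waves evolve with exp(-i k^2 t / 2)."""
    rho = []
    for x in xs:
        psi = sum(a * cmath.exp(1j * (k * x - 0.5 * k * k * t))
                  for a, k in zip(amps, ks))
        psi *= DK / math.sqrt(2 * math.pi)
        rho.append(abs(psi) ** 2)
    return rho


def mean_and_variance(rho):
    """Mean position and variance of a density sampled on the x grid."""
    norm = sum(rho) * DX
    mean = sum(x * r for x, r in zip(xs, rho)) * DX / norm
    var = sum((x - mean) ** 2 * r for x, r in zip(xs, rho)) * DX / norm
    return mean, var


for t in (0.0, 2.0):
    print(t, mean_and_variance(density(t)))
```

The printed pairs show the center moving from 0 to k₀ t while the variance grows from α to α + t²/(4α), which is the dispersion sketched in Figure II.5.11.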


Raising and lowering operators.
Let us briefly talk about yet another way to represent the
general particle state (II.5.7), which utilizes ladder or raising
and lowering operators that are completely analogous
to what we did for the qubit in (II.2.13) and (II.2.14). First
we look for an operator that can step from a state $|k\rangle$ to
$|k+1\rangle$. Consider the following so-called step operators:

$$t_\pm = e^{\pm i\Phi} , \tag{II.5.20}$$

where $\Phi$ is the coordinate operator given in (II.5.1) that
satisfies $\Phi\,|\varphi_0\rangle = \varphi_0\,|\varphi_0\rangle$. Applying $t_\pm$ yields

$$\begin{aligned}
t_\pm |k\rangle &= e^{\pm i\Phi}\,\frac{1}{\sqrt{2\pi}}\int e^{ik\varphi}\,|\varphi\rangle\,d\varphi \\
&= \frac{1}{\sqrt{2\pi}}\int e^{i(k\pm 1)\varphi}\,|\varphi\rangle\,d\varphi = |k \pm 1\rangle ,
\end{aligned} \tag{II.5.21}$$

where we let the operators act on the state $|\varphi\rangle$ in going
from the first to the second line. So with $t_\pm$ one may step
through the spectrum.


This is not yet what we want; what we really want is operators
that start from some lowest energy states. Taking the
energy of the free particle state to be $E_k = p_k^2/2m =
\hbar^2k^2/2m$, the lowest energy state is $|0\rangle$ with $E_0 = 0$.
We would like the right and left moving states to be generated
from some lowest energy states. Consider then, instead of
Figure II.5.12: Step operators. Action of the ladder or step
operators $a^\dagger$ (raising, purple) and $a$ (lowering, green) defined in
equation (II.5.22) on the space of states labeled by $|k\rangle$. There
are two sectors: the right movers with $k \ge 0$ and the left movers
with $k \le -1$.


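the operators $t_\pm$, the two related ladder operators⁹:

$$a = e^{-i\Phi}P , \qquad a^\dagger = P\,e^{i\Phi} , \tag{II.5.22}$$

these satisfy interesting commutation relations:

$$[a, a] = [a^\dagger, a^\dagger] = 0 , \quad\text{and}\quad [a, a^\dagger] = 2P + 1 . \tag{II.5.23}$$

Furthermore we see that for a free particle (with m = 1) we
can write the Hamiltonian as:

$$H = \tfrac{1}{2}\,a^\dagger a = \tfrac{1}{2}\,P^2 . \tag{II.5.24}$$

If we apply the operators to some state $|k\rangle$, we obtain:

$$a^\dagger |k\rangle = (k+1)\,|k+1\rangle , \qquad a\,|k\rangle = k\,|k-1\rangle ,$$
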

which illustrates the fact that these operators basically raise
or lower the momentum of the state by one unit. These
constructions demonstrate two surprising properties of the
states and operators. With these operators you can indeed
walk through the sample space of states but you will run
into certain 'no trespassing' signs, where the next step
you make would let you disappear into nothing! The
first one tells you that if you act with $a$ you may come
down from positive k all the way down to k = 0, but not
any further because $a|0\rangle = 0$. However if you start from
$|k = -1\rangle$, then $a$ will walk you down all the way to minus
infinity. Something similar happens with $a^\dagger$: it walks you
up from any negative value until you hit $|-1\rangle$ where it halts,
but starting at $|0\rangle$ it will bring you all the way to plus infinity.
What this means is that the spectrum naturally breaks up
into two pieces: one of which you could define as the right
movers with $k \ge 0$ and the other as the left movers with
$k < 0$.


⁹In the remainder of this section we set $\hbar = 1$, to keep the formulas
simple.
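The ladder structure and its two 'no trespassing' signs are easy to play with on a computer. The following is a minimal sketch, assuming states are stored as dictionaries mapping an integer momentum k to an amplitude and that the operators act as a|k⟩ = k|k-1⟩ and a†|k⟩ = (k+1)|k+1⟩ (with ħ = 1); the function names are hypothetical helpers introduced for this example.

```python
# States are dicts {k: amplitude} over integer momenta k (hbar = 1).

def lower(state):
    """Lowering operator: a|k> = k |k-1>; zero components are dropped."""
    return {k - 1: k * c for k, c in state.items() if k * c != 0}


def raise_(state):
    """Raising operator: a†|k> = (k+1) |k+1>; zero components are dropped."""
    return {k + 1: (k + 1) * c for k, c in state.items() if (k + 1) * c != 0}


def commutator_on(k):
    """Coefficient c in [a, a†]|k> = c |k>."""
    ket = {k: 1}
    first = lower(raise_(ket)).get(k, 0)    # a a† |k>
    second = raise_(lower(ket)).get(k, 0)   # a† a |k>
    return first - second


print([commutator_on(k) for k in range(-3, 4)])   # values of 2k + 1
print(lower({0: 1}), raise_({-1: 1}))             # both are the empty state
```

The commutator reproduces 2k + 1 on every basis state, and the two empty dictionaries are the places where the walk halts: lowering |0⟩ and raising |-1⟩ both give nothing, which separates the right movers from the left movers.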


State operators. We can now also construct operators
that directly create any momentum state from the ground
state. For example the state $|k\rangle$ (with $k > 0$) can be obtained by acting
k times with $a^\dagger$ on the ground state $|0\rangle$, as the following
calculation shows:

$$|k\rangle = \frac{a^\dagger}{k}\,|k-1\rangle = \dots = \frac{(a^\dagger)^k}{k!}\,|0\rangle .$$


The general state $|\psi\rangle$ could also be symmetrically represented
like an operator:

$$|\psi\rangle = \Psi\,|0\rangle = \Big(\alpha_0 + \sum_{k\ge 1}\frac{1}{k!}\big(\alpha_k\,(a^\dagger)^k + \alpha_{-k}\,b^k\big)\Big)|0\rangle , \tag{II.5.25}$$

where we have defined what you could call a 'particle-state'
operator $\Psi$, and b is a shifted operator $b = e^{-i\Phi}(P - 1)$.
The correspondence between states and this type
of step operators acting on a 'vacuum' or 'ground' state
will be of great use if we move from quantum particles to
quantum fields as we will do in the next section.
The harmonic oscillator


Oscillators everywhere! The harmonic oscillator or the 
‘particle in a harmonic oscillator potential’ is a system that 
is treated extensively in any book on quantum physics and 
classical physics alike. In spite of the fact that we do not 
see swinging pendulums all over the place, the simple truth 
is that the world around us is actually largely made up of 
oscillators! One way to understand that is to realize that 
most ‘things’ are in a state of equilibrium, in other words 
they are in a state of minimum energy. And yes, if you per- 
turb a system in equilibrium, it will start to oscillate about 
its equilibrium state. You knock on the table, you drop a 
stone in the lake, the days, the seasons, economic cycles, 
the orientation of the Earth’s axis, the strings of your guitar 
and of string theory, the rhythms of life: all are oscillatory 
motions in some suitable space. 


So imagine the horizontal axis describing the displacement 
of some relevant variable from equilibrium, and let us call 
that variable, yes indeed, x, then along the vertical axis 
we plot the energy V (as a function of x). This function 
generically will have a particular shape. It will have a min- 
imum at x = 0, and if we think that we study small per- 
turbations we might look at V(x) close to the origin and 
describe it effectively as an expansion in (positive) pow- 
ers of x. The first term would be linear, but that could not 
be because it would not correspond to a minimum any- 
more, the minimum would have shifted away. So the first 
relevant term would be the quadratic term which we write 
as $V(x) = \tfrac{1}{2}\omega^2 x^2$. You get the bowl-shaped potential depicted
in Figure II.5.14. In Newtonian mechanics this would
imply a force $F = -dV/dx = -\omega^2 x$, thus a linear force
trying to move the system back to the equilibrium position.
This is, not surprisingly, called a harmonic force. You can
think of a marble rolling back and forth in the bowl. We discussed
this dynamical system at length in Chapter I.1. It is
important to now look at quantum oscillators because the 
microscopic world is also beset with them. This is a model 


Figure II.5.13: The stepwell of Chand Baori. These remarkable
stepwells in India were once used to store water. Chand
Baori is made up of 3,500 steps over 13 stories. The steps look
like states forming a discrete spectrum of some quantum system.
(Source: Wikimedia.)


system that at first looks like one of these totally boring 
academic, dry-nerd-drill-home-trainer kind of things. The 
deadliest didactic horse ever. No! Imagine, its applications 
on all rungs of the quantum ladder are quite stunning and 
we will come across a few of them. So, please stay with 
me for this one. 


If we return to basics, our starting point is the simple Hamiltonian
for a unit mass particle in a harmonic potential:

$$H = \tfrac{1}{2}\left(p^2 + \omega^2 x^2\right) . \tag{II.5.26}$$

The classical equations are,

$$\frac{dx}{dt} = p , \qquad \frac{dp}{dt} = -\omega^2 x .$$

We will treat these equations in the Heisenberg picture
meaning that we have time dependent operators X(t) and
P(t) with the canonical commutation relations:¹⁰ [X, P] =


¹⁰If we postulate them at t = 0, the unitary time evolution ensures
Figure II.5.14: Harmonic oscillator. Action of the ladder or
step operators $a^\dagger$ (raising) and $a$ (lowering) defined in equation
(II.5.29) on the space of energy eigenstates $|n\rangle$. The ground
state $|0\rangle$ has energy $E_0 = \tfrac{1}{2}\omega$.


[X(t), P(t)] = iħ. Interestingly the equations can then
be solved by using (commutator) algebra only. These are
coupled equations, and you can decouple them by using
the complex linear combinations which are, as we will see,
raising and lowering operators:

$$a(t) = \sqrt{\frac{1}{2\omega}}\,(\omega X + iP) = \tilde X + i\tilde P , \tag{II.5.27}$$

$$a^\dagger(t) = \sqrt{\frac{1}{2\omega}}\,(\omega X - iP) = \tilde X - i\tilde P . \tag{II.5.28}$$

The solutions have simple phases:

$$a(t) = a\,e^{-i\omega t} , \qquad a^\dagger(t) = a^\dagger e^{i\omega t} , \tag{II.5.29}$$

these satisfy simple commutation relations:

$$[a, a] = [a^\dagger, a^\dagger] = 0 , \quad\text{and}\quad [a, a^\dagger] = 1 . \tag{II.5.30}$$

Furthermore we see that for the particle in a harmonic potential
we can write the Hamiltonian as:

$$H = \omega\left(a^\dagger a + \tfrac{1}{2}\right) . \tag{II.5.31}$$


that they remain valid over time. 


These operators raise or lower the energy of an energy eigenstate
by one step. This follows from the commutation relations:

$$[H, a] = -\omega\,a , \tag{II.5.32}$$

$$[H, a^\dagger] = +\omega\,a^\dagger . \tag{II.5.33}$$

Let us define the eigenstates $|n\rangle$ of the Hamiltonian as

$$H\,|n\rangle = E_n\,|n\rangle , \tag{II.5.34}$$

then with (II.5.33), we obtain that applying $a^\dagger$ to a state $|n\rangle$
creates the state $|n+1\rangle$, because

$$H\,\{a^\dagger|n\rangle\} = (E_n + \omega)\,\{a^\dagger|n\rangle\} ,$$

and similarly for $a|n\rangle$ with a minus sign on the right-hand
side of the equation. Now we can see what we have
gained with these manipulations. First we had better assume
that there is a lowest energy state $|0\rangle$ and as the energy
cannot be lower we have to assume that the lowering operator
gives zero when acting on this state:

$$a\,|0\rangle = 0 , \tag{II.5.35}$$

and thus:

$$H\,|0\rangle = \tfrac{1}{2}\,\omega\,|0\rangle .$$

From this one can show other quantessential properties:

$$E_n = \left(n + \tfrac{1}{2}\right)\omega ,$$

and,

$$|n\rangle = \frac{(a^\dagger)^n}{\sqrt{n!}}\,|0\rangle .$$


The results are summarized in Figure II.5.14. There are
a few points worth mentioning. Firstly, the spectrum is
equally spaced and, unlike the free particle, we have no
degenerate left and right movers. So it is easy to construct
raising and lowering operators. It is worth mentioning here
already that later on in this
chapter we will see an application of the oscillator algebra
in field theory, where the operators $a^\dagger$ and $a$ do not move
you through the spectrum of states of a single particle, but
rather they act as creation and annihilation operators of
particles in a given state, acting on a multi-particle Hilbert
space. The second point is that the ground state has a
non-vanishing 'zero point energy' equal to $\tfrac{1}{2}\omega$, which basically
follows from the uncertainty relations which do not
allow the quantum particle to be at rest at the bottom of the
potential. The momentum (energy) cannot be zero. And
indeed if you think of a table which is made up of zillions 
(or better 107° or so) of oscillating particles you may won- 
der about the energy that appears to be just sitting there. 
In the cellar as it were, an incredible amount of vacuum
energy. What if...? Maybe we should just be cavalier
about it and put it in the same category as our friend the
'filled Dirac sea', where the physics basically only starts
once you are on top, at the surface.
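The oscillator algebra can be made concrete with small matrices. The sketch below is an illustration under stated assumptions: the basis is truncated to the first few states |0⟩, ..., |DIM-1⟩, the frequency is an arbitrary number, ħ = 1, and the standard matrix elements a|n⟩ = √n |n-1⟩ are used. It builds a and a†, then reads off the energies ω(n + 1/2) and the commutator [a, a†].

```python
import math

DIM, OMEGA = 6, 2.0   # truncation size and frequency (illustrative assumptions)

# Matrix elements <row|a|col> in the basis |0>, ..., |DIM-1>: a|n> = sqrt(n)|n-1>.
a = [[math.sqrt(col) if row == col - 1 else 0.0 for col in range(DIM)]
     for row in range(DIM)]
a_dag = [[a[col][row] for col in range(DIM)] for row in range(DIM)]  # transpose


def matmul(x, y):
    """Product of two DIM x DIM matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


number = matmul(a_dag, a)                                  # N = a† a
energies = [OMEGA * (number[n][n] + 0.5) for n in range(DIM)]

aa_dag = matmul(a, a_dag)
commutator = [[aa_dag[i][j] - number[i][j] for j in range(DIM)]
              for i in range(DIM)]

print(energies)
print([commutator[n][n] for n in range(DIM)])
```

The energies come out equally spaced, starting at the zero point value ω/2. The commutator is the identity except in the last diagonal entry, which is an artifact of cutting off the infinite ladder at a finite size, not a property of the oscillator.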


Constructing the wavefunctions. The explicit expressions
for the wavefunctions $\psi_n(x) = \langle x|n\rangle$ are most easily
obtained recursively starting from the ground state. The
ground state wavefunction can be constructed by solving
the equation (II.5.35) as follows:

$$a\,|0\rangle = 0 \quad\Rightarrow\quad \left(\omega x + \frac{d}{dx}\right)\psi_0(x) = 0 .$$

This is a differential equation with the (normalized) gaussian
solution:

$$\psi_0(x) = \left(\frac{\omega}{\pi}\right)^{1/4} e^{-\omega x^2/2} .$$

The higher states are obtained by repeatedly applying the
raising operator $a^\dagger = \sqrt{1/2\omega}\,(\omega x - d/dx)$ on this ground
state. So one just has to differentiate the ground state
which is relatively easy to do. The resulting wavefunctions
$\psi_n(x) = \langle x|n\rangle$ for the lowest n values were already displayed
in Figure II.5.4 on page 383.
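As a worked instance of this recipe (a sketch with ħ = m = 1 and a gaussian ground state proportional to $e^{-\omega x^2/2}$, normalized by $(\omega/\pi)^{1/4}$), one application of the raising operator gives the first excited state:

```latex
\psi_1(x) = a^\dagger \psi_0(x)
          = \sqrt{\frac{1}{2\omega}}\left(\omega x - \frac{d}{dx}\right)
            \left(\frac{\omega}{\pi}\right)^{1/4} e^{-\omega x^2/2}
          = \sqrt{2\omega}\; x \left(\frac{\omega}{\pi}\right)^{1/4} e^{-\omega x^2/2} .
```

The derivative simply brings down another factor $\omega x$, so each step multiplies the gaussian by a polynomial of one degree higher; $\psi_1$ is odd in x and has a single node at the origin.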


Let us return to the question of constructing quantum states
that do look like a classical particle. These correspond to
a wave packet, where we start combining waves in such a 
way that they have a reasonable width both in momentum 
and position space. We look for states that have a minimal 
spread about the average values of the variables, thereby 
making the uncertainty around a corresponding point in 
classical phase space in all directions as small as possi- 
ble. Such states were already considered by Schrödinger 
and are nowadays called coherent states. They represent 
a wide class of states that just like the oscillator system 
have found many applications. These vary from quan- 
tum mechanics, optics, quantum chemistry, atomic phys- 
ics, statistical physics, nuclear physics, particle physics, 
quantum information theory, group theory, and cosmology, 
to mention a few. 


Coherent states 


Let us now apply this idea to the states of a particle in 
the harmonic oscillator potential. We introduced the clas- 
sical version of the harmonic oscillator already in the first 
chapter of Volume I on page 14. The periodic motion in
configuration space that corresponds to a circular motion 
in phase space is characteristic. We now want to construct 
quantum states that show similar behavior. These cannot 
be the stationary energy eigenstates we have just been 
constructing in this subsection. 


Minimal uncertainty states. From the commutator,

$$[X, P] = i\hbar ,$$

directly follows the standard form of the uncertainty relation:

$$\Delta(X)\,\Delta(P) \ge \frac{\hbar}{2} . \tag{II.5.36}$$

What we would like to find is a state where we have that

$$\Delta(X) = \Delta(P) = \Delta , \qquad \Delta^2 = \frac{\hbar}{2} .$$

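The states that achieve this are eigenstates of the lowering
operator $a$, so we have:

$$a\,|\lambda\rangle = \lambda\,|\lambda\rangle , \tag{II.5.37}$$

these eigenstates have the basic property that $\lambda = \langle\lambda|a|\lambda\rangle$,
but also that:

$$\langle E\rangle = \langle\lambda|H|\lambda\rangle = \omega\,\langle\lambda|\,a^\dagger a + \tfrac{1}{2}\,|\lambda\rangle = \omega\left(|\lambda|^2 + \tfrac{1}{2}\right) .$$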


Let us pause here for an instant. The question here is
not to construct a ground state or an eigenstate of a given
Hamiltonian; rather it is to construct eigenstates of the annihilation
operator, with some eigenvalue $\lambda$. This problem
is analogous to the construction of the translation operator
that we discussed in equation (II.6.6). If we have an eigenfunction
of position, where the expectation value of x is
given by zero, we may apply the translation operator T(a)
to it, and shift the argument of the wavefunction so that
$\psi(x) \to \psi(x+a)$. Then the vacuum expectation value will
shift to $\langle x\rangle = -a$. And indeed the procedure is closely related:
the desired state can be made out of the vacuum by
a 'translation' operator built from the conjugate variable, in
this case not the translation generated by the momentum
P, but by $a^\dagger$:

$$|\lambda\rangle = e^{\lambda a^\dagger}\,|0\rangle . \tag{II.5.38}$$


So, if we write $a = (X + iP)$, then such states have the
property that:

$$(X + iP)\,|\lambda\rangle = (\langle X\rangle + i\langle P\rangle)\,|\lambda\rangle = \lambda\,|\lambda\rangle , \tag{II.5.39}$$

because for an eigenstate the expectation value of the
operator is equal to the eigenvalue. Bringing terms to the
other side we obtain $(X - \langle X\rangle)|\lambda\rangle = -i\,(P - \langle P\rangle)|\lambda\rangle$,
so that, taking norms:

$$|X - \langle X\rangle| = |P - \langle P\rangle| ,$$

which establishes that the variances are equal: $\Delta(X) = \Delta(P) = \Delta$.
From the equation (II.5.39)

$$\langle a^\dagger a\rangle = |\lambda|^2 = \langle (X - iP)(X + iP)\rangle = \langle X^2 + P^2 + i[X, P]\rangle = \langle X^2\rangle + \langle P^2\rangle - \hbar .$$

Taking the absolute square of equation (II.5.39), which contains
the expectation values, gives the result:

$$|\lambda|^2 = \langle X\rangle^2 + \langle P\rangle^2 .$$

Combining the two previous results we obtain the equation
for the sum of the variances:

$$\Delta(X)^2 + \Delta(P)^2 = 2\Delta^2 = \hbar ,$$

giving $\Delta^2 = \hbar/2$ which is the minimum value allowed.


Figure II.5.15: A fuzzy particle. Phase space picture of the coherent
state wavepacket with its fixed uncertainties for large values
of $\lambda$. The coherent state x and p expectation values follow
the classical trajectories but they carry a disk of uncertainties
with diameter $\hbar/2$ along. (Labels in the figure: quantum
coherent state; classical particle.)


A fuzzy particle. What have we learned? Firstly that it
is indeed possible to construct wave packets or coherent
states in which the uncertainties in position x and momentum
p match. In fact we found a continuum of different
states $|\lambda\rangle$ that satisfy those conditions, and these states
are labeled by the complex parameter $\lambda$. Secondly we saw that
the average momentum is of order $\lambda$, while the width
of the momentum distribution in such a state is fixed,
with $\Delta^2 = \hbar/2$. This means that if we increase $\lambda$ the probability
cloud of the particle becomes relatively narrow. The resulting
overall picture is displayed in Figure II.5.15. The radial
direction is the 'a' or therefore $\lambda$ axis with real component
x and imaginary component p. The time dependence of
a(t) is $a(t) = a\exp(-i\omega t)$ as given in equation (II.5.29), so
$\omega t$ is the angular variable in the figure. The resulting expectation
values $\langle x(t)\rangle$ and $\langle p(t)\rangle$ describe the same trajectory
in phase space as the classical particle would do.
The classical periodic motion was depicted in Figure II.5.1
and the corresponding circular motion in phase space in
Figure II.5.2 on page 380. We have emphasized that the
uncertainties in position and momentum are fixed and independent
of $\lambda$, which means that the approximation of
the classical picture improves if we increase $\lambda$. This basically
corresponds to the limit of high momentum or energy
levels, where you would indeed expect classical behavior
because the energies are large compared to the
ground state level. Note, however, the contrast with the free
wave packet, which broadens as a function of time because
its various momentum components move at different velocities,
as depicted in Figure II.5.3: the coherent state of the
oscillator keeps its fixed uncertainties as it moves around.


The energy spectrum of coherent states.
In this final paragraph of this section we show what the
states $|\lambda\rangle$ look like if we decompose them in energy eigenstates.
To do so we use a cute little trick. Note that the
$a$ operator, because of the commutation relation with $a^\dagger$,
can be thought of as differentiation with respect to $a^\dagger$. This
means that we can write:

$$a\,|n\rangle = a\,\frac{(a^\dagger)^n}{\sqrt{n!}}\,|0\rangle = \frac{d}{da^\dagger}\frac{(a^\dagger)^n}{\sqrt{n!}}\,|0\rangle = \frac{n\,(a^\dagger)^{n-1}}{\sqrt{n!}}\,|0\rangle = \sqrt{n}\,|n-1\rangle .$$


Figure II.5.16: Coherent states. The probability distributions
$p_\lambda(n)$ given in equation (II.5.40) for finding energy $n \sim \lambda^2$ in a
coherent state $|\lambda\rangle$ for $\lambda^2 = 1, \dots, 10$.


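The states $|\lambda\rangle$ can be obtained by finding a recursion relation
for the coefficients $\alpha_n$ in (II.5.25) by imposing the
defining equation (II.5.37):

$$a\Big(\sum_{n=0}^{\infty}\alpha_n|n\rangle\Big) = \sum_{n=0}^{\infty}\alpha_n\sqrt{n}\,|n-1\rangle = \lambda\Big(\sum_{n=0}^{\infty}\alpha_n|n\rangle\Big) .$$

Matching corresponding components we obtain the recursion
relation:

$$\alpha_{n+1} = \frac{\lambda}{\sqrt{n+1}}\,\alpha_n \quad\Rightarrow\quad \alpha_n = N\,\frac{\lambda^n}{\sqrt{n!}} ,$$

with the normalization constant¹¹ $N = \exp(-|\lambda|^2/2)$. So
what we have constructed here are coherent states parametrized
by a parameter $\lambda$ which have minimal and equal
uncertainties for both conjugate phase space variables.


¹¹Normalization of the state gives:
$\langle\lambda|\lambda\rangle = N^2 \sum_{n=0}^{\infty} |\lambda|^{2n}/n! = N^2 \exp(|\lambda|^2) = 1$.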
These states have many momentum components; in fact
we can calculate the energy distribution, which is:

$$p_\lambda(n) = |\langle n|\lambda\rangle|^2 = \frac{|\lambda|^{2n}}{n!}\,e^{-|\lambda|^2} . \tag{II.5.40}$$

These are so-called Poisson distributions and we have plotted
them in Figure II.5.16 for values $\lambda^2 = 1, \dots, 10$. We
easily calculate the average:

$$\langle n\rangle = \langle\lambda|N|\lambda\rangle = \langle a^\dagger a\rangle = |\lambda|^2 , \tag{II.5.41}$$

whereas for the average of $n^2$ we obtain:

$$\langle n^2\rangle = \langle a^\dagger a\,a^\dagger a\rangle = \langle (a^\dagger)^2a^2 + a^\dagger a\rangle = |\lambda|^4 + |\lambda|^2 .$$

Combining the two we find for the variance:

$$(\Delta n)^2 = \langle n^2\rangle - \langle n\rangle^2 = |\lambda|^2 . \tag{II.5.42}$$

We see that the average of n is proportional to $\lambda^2$ while
the width of the distribution goes like $\lambda$. This means that for
increasing $\lambda$, the distribution relatively narrows. This is of
course consistent with our calculation from the uncertainty
relation (II.5.36) where we found the same variance. The
resulting situation is summarized in Figure II.5.15.
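The recursion and the resulting Poisson statistics can be checked in a few lines. This is a minimal sketch, assuming the coefficients obey α_{n+1} = λ α_n / √(n+1) with α_0 = exp(-|λ|²/2), a sum truncated at a finite NMAX, and an arbitrary illustrative value of λ; the variable names are introduced here for the example.

```python
import math

lam = 2.0 + 0.0j      # eigenvalue of the lowering operator (illustrative assumption)
NMAX = 60             # truncation of the sum over n (assumption)

# Recursion alpha_{n+1} = lam * alpha_n / sqrt(n + 1), starting from alpha_0 = N.
alphas = [math.exp(-abs(lam) ** 2 / 2)]
for n in range(NMAX):
    alphas.append(lam * alphas[-1] / math.sqrt(n + 1))

probs = [abs(c) ** 2 for c in alphas]             # p(n) = |<n|lambda>|^2
total = sum(probs)
mean_n = sum(n * p for n, p in enumerate(probs))
var_n = sum(n * n * p for n, p in enumerate(probs)) - mean_n ** 2

# Eigenvalue check: the component of a|lambda> along |n> is sqrt(n+1) * alpha_{n+1},
# and it should equal lam * alpha_n.
residual = max(abs(math.sqrt(n + 1) * alphas[n + 1] - lam * alphas[n])
               for n in range(NMAX))

print(total, mean_n, var_n, residual)
```

The probabilities sum to one, and the mean and the variance both come out equal to |λ|², which is the hallmark of a Poisson distribution: the relative width Δn/⟨n⟩ = 1/|λ| shrinks as λ grows.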


Fields: particle species 


In this section on quantum fields we bring together a num- 
ber of insights that we have touched upon in previous chap- 
ters. When saying field theory, we start by thinking about 
free fields, these are described for example by the Maxwell 
equations, the Klein—Gordon or the Dirac equation. All of 
them are relativistic wave equations and the question is 
what it means to quantize them. 


Let us make some observations first. 
(i) Fields are defined over all of space and they typically 


Figure II.5.17: A 1962 conversation between Dirac (left) and
Feynman (right) at a conference in Warsaw. (Source: Courtesy of
Caltech Photo Archives.)


have an infinite number of degrees of freedom, and in that 
sense you can think of them as equivalent to an infinite 
number of particles. 

(ii) You can think of the fields as being the generalized 
coordinates, meaning to say that the configuration space 
which for a single particle is just ‘x’-space is now the space 
of field configurations. 

(iii) In Chapter I.1 we have shown that for a field like the
electromagnetic field we can define an energy and a mo- 
mentum density and the logic of field quantization is to run 
the same program as before, and impose canonical quan- 
tization conditions for fields (as coordinates) and their as- 
sociated momenta. 
This procedure is quite involved and it took about thirty
years before the first consistent field theory, named quantum
electrodynamics (QED), was completed.


Field quantization. Without going through any calculations,
which are generally quite messy and extensive, let
me nevertheless give you some feeling for the results which
are strikingly simple and beautiful.¹² And to transcend the
swamp of words let me take the example of the simple
scalar particle described by the Klein-Gordon field $\phi(x, t)$,
which has to satisfy the relativistic equation

$$(\Box + m^2)\,\phi(x^\mu) = 0 .$$

This has solutions which can be expanded as a sum of
plane wave solutions with coefficients $a_k$ and $a_k^*$, and look like:

$$\phi(x^\mu) = N \sum_k \left[a_k\,e^{-ik\cdot x} + a_k^*\,e^{ik\cdot x}\right] ,$$

with the definitions $x^\mu = (ct, \mathbf{x})$ and $k^\mu = (\omega_k, \mathbf{k})$, where
$k\cdot x \equiv k_\mu x^\mu$, and moreover the K-G equation imposes
$\omega_k = \sqrt{k^2 + m^2}$.
The coefficients have to be each other's complex conjugates
to make the field real. In this case the momentum
field would just be $\pi(x^\mu) = \partial\phi/\partial t$ which is indeed the time
derivative of the 'coordinate' field.


Oscillators once more. In the present context the fields 
are the observables! So the quantum fields are operators, 
and as they are time-dependent, they are Heisenberg type 
operators. What that means is that in the above expression 
which is called mode expansion, the field on left-hand 
side becomes an operator, and on the right-hand side the 
operator property is carried by the coefficients. The modes 
are just the classical plain waves multiplied by operator co- 
efficients a, and their conjugates al. These act now like 
creation and annihilation operators. Performing the calcu- 
lational gymnastics of imposing the commutation relations 
for the fields and zr in the end boils down to commutation 


21m this section we have seth = c = 1 for convenience. 


relations between the operator coefficients. The upshot is 
surprisingly simple: 


$$[a_k, a_{k'}] = [a_k^\dagger, a_{k'}^\dagger] = 0, \tag{II.5.43}$$


$$[a_k, a_{k'}^\dagger] = \delta_{kk'}. \tag{II.5.44}$$


But now the air clears up! Compare this result with the
commutation relations in (II.5.30). What have we got? We
have obtained an infinite number of harmonic oscillators,
each labeled by a momentum vector $\mathbf{k}$ and having a
frequency $\omega_k$. So, one (free) quantum field is equivalent to
an infinity of oscillators, and that rings an infinity of bells.
The energy or Hamiltonian $H$ of the field is not so surprising:


$$H = \sum_k \omega_k \left(N_k + \tfrac{1}{2}\right),$$


with $N_k = a_k^\dagger a_k$, the so-called number operator. There is
also a total momentum vector $\mathbf{P}$ for the field:


$$\mathbf{P} = \sum_k \mathbf{k} \left(N_k + \tfrac{1}{2}\right).$$


The above equations naturally combine in an energy-momentum
four-vector $P_\mu = \{H, \mathbf{P}\}$ for the field.


Multi-particle Hilbert space. And what does the Hilbert
space for such a free field look like? Well, first we define
a vacuum state $|0\rangle$ with the defining property that it is
annihilated by all $a_k$ operators. Now we act with a creation
operator on the vacuum:


$$a_k^\dagger\,|0\rangle = |n_k\rangle \quad \text{with } n_k = 1.$$


This means that we have made a step in energy of
$E = \hbar\omega_k = \sqrt{(mc^2)^2 + (\hbar k c)^2}$, where I have put the constants
back in. That energy corresponds exactly to the relativistic
energy of a single particle of mass $m$ with energy $E = \hbar\omega$
and momentum $p = \hbar k$. So, we are not raising the energy
of a single particle. No, every time we act with an $a_k^\dagger$
operator we create an additional particle of the type
described by the field in the corresponding momentum state.
Figure II.5.18: Quantum field modes. A quantum state of a
quantum field is labeled by the energy-momentum ($k$) modes of
the single particle, and the number of particles $n_k$ that are in
that mode. (Horizontal axis: momentum modes, from low to high.)


And the annihilation operator does exactly the opposite.
How charming and quantessential: the same algebra in
another context creates another reality! The upshot is that
we have a multi-particle Hilbert space, often called Fock
space, with states


$$|\{n_k\}\rangle, \quad \text{with } N_k\,|\{n_k\}\rangle = n_k\,|\{n_k\}\rangle.$$


We have ended up with a crisp and clear framework indeed.
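
To make the bookkeeping of occupation numbers concrete, here is a small sketch of a Fock space restricted to a handful of modes. It is an illustration under stated assumptions, not a field-theory library: momenta are one-dimensional, ħ = c = 1, the set of modes is finite (so the zero-point sum stays finite), and a state is stored as a hypothetical pair (amplitude, occupation numbers). The factors √(n+1) and √n are the usual harmonic-oscillator ones.

```python
import math

def omega(k, m):
    """Single-particle energy of mode k (hbar = c = 1)."""
    return math.sqrt(k * k + m * m)

def create(state, k):
    """Apply a_k^dagger to an (amplitude, occupations) pair."""
    amp, occ = state
    n = occ.get(k, 0)
    new_occ = dict(occ)
    new_occ[k] = n + 1
    return (amp * math.sqrt(n + 1), new_occ)

def annihilate(state, k):
    """Apply a_k; an empty mode is annihilated (amplitude 0)."""
    amp, occ = state
    n = occ.get(k, 0)
    if n == 0:
        return (0.0, {})
    new_occ = dict(occ)
    if n == 1:
        del new_occ[k]
    else:
        new_occ[k] = n - 1
    return (amp * math.sqrt(n), new_occ)

def energy(occ, m, modes):
    """H = sum_k omega_k (n_k + 1/2) over a finite set of modes."""
    return sum(omega(k, m) * (occ.get(k, 0) + 0.5) for k in modes)

def momentum(occ, modes):
    """P = sum_k k (n_k + 1/2); the 1/2 terms cancel for symmetric modes."""
    return sum(k * (occ.get(k, 0) + 0.5) for k in modes)

modes = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical finite set of momenta
m = 1.0
vacuum = (1.0, {})
one_particle = create(vacuum, 2.0)

# Creating one particle in mode k = 2 raises the energy by omega_k.
energy_step = energy(one_particle[1], m, modes) - energy(vacuum[1], m, modes)

# [a, a^dagger] = 1, checked on a state with three quanta in mode k = 1.
state = (1.0, {1.0: 3})
a_adag = annihilate(create(state, 1.0), 1.0)[0]
adag_a = create(annihilate(state, 1.0), 1.0)[0]
print(energy_step, omega(2.0, m), a_adag - adag_a)
```

The energy step equals the single-particle energy of the mode, which is the numerical counterpart of the statement that each action of a creation operator adds one particle.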


The Klein-Gordon field is the simplest one to think of
because it is just a field with one real component, but what
about the other fields, like the Maxwell and Dirac fields? Is
their quantization the same? Yes and no: it is similar but at
the same time very different, also because their classical
content is very different. In the Dirac case we have to
understand what it means to have the Dirac sea and how to
implement the anti-particles. Now the basic relations for the
operators are anti-commutation relations,


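$$\{b_{s,p}, b_{s',p'}\} = \{b_{s,p}^\dagger, b_{s',p'}^\dagger\} = 0, \tag{II.5.45}$$


$$\{b_{s,p}, b_{s',p'}^\dagger\} = \delta_{ss'}\,\delta_{pp'}, \tag{II.5.46}$$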


and an identical set for the anti-particle creation and
annihilation operators $d_{s,p}^\dagger$ and $d_{s,p}$. The index $s$ denotes the
spin state of the (anti-)particle. The anti-commutator is
defined as the symmetric product, for example:


$$\{b_{s,p}^\dagger, b_{s',p'}^\dagger\} = b_{s,p}^\dagger\, b_{s',p'}^\dagger + b_{s',p'}^\dagger\, b_{s,p}^\dagger.$$


This definition has a profound implication that becomes
manifest if you let the equation for a vanishing
anti-commutator act on the vacuum. It yields the result


$$b_{s,p}^\dagger\, b_{s',p'}^\dagger\,|0\rangle = -\,b_{s',p'}^\dagger\, b_{s,p}^\dagger\,|0\rangle.$$


The two-particle states on the right and left have two parti- 
cles in the same individual states but they are interchanged. 
We have interchanged two identical particles and that gives 
a crucial minus sign because of the anti-commutators. The 
relation with the Pauli principle becomes even more direct 
if you put p’ = p and s’ = s, because then you get that 
that particular state equals minus itself, which means that 
that state is equal to zero! It says that such a state is
just not there. It is not the ground state but a true
no-state: a clearer statement of exclusion is hardly
imaginable! With the Dirac equation everything fell into place:
the spin appeared as a necessary ingredient, along with the
exclusion principle after the correct quantization. And then
anti-matter as a bonus. How delightful! For the Maxwell
field, it is the gauge invariance which has caused some
profound headaches. But today all these difficulties have
been overcome, and these types of (gauge) fields and their
quantization form the basis of a consistent description of
all particles carrying forces or interactions in the Standard
Model.
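
The minus sign under interchange and the resulting exclusion can be mimicked with a few lines of bookkeeping. This is a sketch under assumptions of my own choosing: a basis state is a sorted tuple of occupied modes, a mode is a hypothetical (spin, momentum) label, and a general state is a dictionary from basis states to amplitudes. The sign rule counts how many occupied modes the new creation operator has to be anti-commuted past.

```python
def create(state, mode):
    """Apply the fermionic creation operator for `mode` to a state.

    Basis convention: the tuple (q1, q2, ...) with q1 < q2 < ... stands for
    the creation operators for q1, q2, ... acting in that order on the vacuum.
    """
    out = {}
    for occupied, amp in state.items():
        if mode in occupied:
            continue  # creating twice in the same mode gives zero
        # One minus sign for every occupied mode ordered before `mode`.
        sign = (-1) ** sum(1 for q in occupied if q < mode)
        new_occupied = tuple(sorted(occupied + (mode,)))
        out[new_occupied] = out.get(new_occupied, 0) + sign * amp
    return out

vacuum = {(): 1}
p = ("down", 1)  # hypothetical mode labels: (spin, momentum)
q = ("up", 2)

state_pq = create(create(vacuum, q), p)  # create q first, then p
state_qp = create(create(vacuum, p), q)  # create p first, then q
doubly = create(create(vacuum, p), p)    # try to put two particles in p

print(state_pq, state_qp, doubly)
```

The two orderings give the same two-particle state with opposite amplitudes, and the doubly occupied attempt returns an empty dictionary: not the vacuum, but no state at all.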


Interactions. Of course, if we discuss quantum field theory
there is more than the quantization of free fields: it is a
multi-particle framework, but the all-important interactions
are left out. Isn’t this throwing out the baby with the
bathwater? No! This is a basic framework that is an
absolutely vital starting point for any further discussion.


Perturbative approaches. We have in Chapter II.1 already
described some of the interactions that are present
in the Standard Model. The basic interactions are
characterized by certain interaction vertices, diagrams where
different particles interact at a given space-time point. That
point is where particles are annihilated and created in par- 
ticular states that ensure that all the conservation laws like 
energy, momentum or charge, are respected. Each com- 
plete diagram then contributes to the overall probability 
amplitude for the process to take place. 


This approach is called a perturbative approach, which is 
an iterative procedure to get ever better results, because in 
the calculations you include more and more complicated, 
higher-order diagrams. And as long as the coupling constant
is small (for QED, for example, the coupling strength is
$\alpha = e^2/4\pi\hbar c \simeq 1/137$), the higher-order terms
become tiny.


This way relatively low-order calculations already give in- 
credibly accurate answers. And this scheme has led to the 
spectacular demonstrations of the power of quantum field 
theory, as for example in the calculation of the anomalous 
magnetic moments of the electron and the muon. The
calculations go up to fourth order in $\alpha$ and coincide with
the best observed values up to 10 significant digits. This
makes it the most accurately verified prediction in the
history of physics!
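
How quickly the successive orders shrink can be seen from a few lines of arithmetic. The sketch below assumes the rounded value α ≈ 1/137.036 and, as a rough rule of thumb adopted only for this illustration, that each further order costs a factor α/π with coefficients of order one. The leading correction to the electron's magnetic moment, α/2π, is Schwinger's well-known first-order result; the higher-order coefficients are not reproduced here.

```python
import math

ALPHA = 1 / 137.036  # approximate fine-structure constant (rounded value)

# Rough size of the n-th order contribution, assuming O(1) coefficients.
terms = [(ALPHA / math.pi) ** order for order in range(1, 5)]
for order, size in enumerate(terms, start=1):
    print(f"order {order}: ~{size:.2e}")

# Leading-order anomalous magnetic moment of the electron: alpha / (2 pi).
schwinger = ALPHA / (2 * math.pi)
print(f"alpha/(2 pi) = {schwinger:.7f}")
```

Each order is smaller than the previous one by a factor of a few hundred, which is why a handful of orders already reaches so many significant digits.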


Beyond perturbation theory. But in many situations it is 
necessary to go beyond perturbation theory. If either the 
particle density is large, or if the temperature gets very low, 
or the interactions become strong, one needs other ap- 
proaches. And in the past century a lot of progress 


The other currency 


G: Hey Orange, I really like the stuff you told me
about Dirac.

O: I am happy you liked it, Green. But you are right,
he’s a kind of a genius!

G: Yeah. That’s what I thought, but more an
anti-genius maybe, chr chrr chrr!

O: He must have been very happy, making
discoveries of such profound importance for mankind.

G: Yeah. Hey Orange, I presume he must have
become very, very rich.

O: You mean like Bill Gates or Warren Buffett.

G: or Prince or Picasso?

O: or Irving Stone or...

G: or Oprah!

O: Yes, you would think so Green. But no, I have to
disappoint you.

G: But Orange, if you do such great works...

O: It didn’t happen.

G: You mean that others have stolen his ideas?

O: No Green, it is not that. You have to understand,
Green, for scientific achievements like Einstein’s or
Dirac’s or Heisenberg’s there are no rights.

G: Are you telling me that they forgot to manage
their copyrights or patents? These brilliant men
didn’t do their homework, is that it, chr chrr chrr.

O: Quiet down Green. Respect! Let me tell you
this: a formula isn’t like a novel, or a song, or a
baseball game, or a paperclip, or a diesel engine,
or a talk show.

G: Are you saying that in the big scheme of things
it is just marginal?

O: Yes indeed, Green, thank you. Now you
understand what I mean.

G: Thank you Orange, I think I am going to have a
peanut butter jelly sandwich! A Schrödinger-Dirac-
Heisenberg sandwich! chr chrr chrrr.

O: Green! Listen, scientists have another type of
currency.

G: Like bitcoins?

O: Yes Green, but they call them citations.

G: What do those buy you?

O: Well, you know, Green, you know this game
called Monopoly? You can make a lot of money ...

G: I am getting really hungry. Thanks Orange.


has been made in developing alternative non-perturbative
ways of using field theory. We will discuss some important
examples in the context of condensed matter physics in
Chapter III.3.


Often situations where perturbation theory breaks down
have to do with identifying some highly non-trivial ground
state and starting from there. For example, it may be that a
certain particle type will condense in the ground state, so
that it is no longer an eigenstate of the number operators
$N_k$. In fact one finds that some number density operator
has a non-vanishing expectation value in the new ground
state. The ground state of the superconductor is a canonical
and beautiful example.


The phenomenon of superconductivity was discovered by 
Kamerlingh Onnes, but it took more than half a century
to arrive at a really deep understanding of the underlying 
mechanism. Among other things the message to science 
seemed to be: ‘Never give up!’ 


Let us briefly indicate what it means that the ground state 
of a physical system is characterized by some condensate. 


Think of the electrons in a conductor: they interact over
relatively long distances via the lattice vibrations, which after
quantization go under the name phonons. This phonon-induced
interaction between the electrons turns out to be
attractive, and leads to a pairwise binding of electrons
of opposite spin and momentum. The electrons form
so-called Cooper pairs. These pairs, having spin equal to zero,
are of course bosons, and therefore they can all condense
in the same state. Indeed the ground state is a coherent
state of Cooper pairs, which can be thought of as a
linear combination of states with all possible different
numbers of pairs in it. The system gains an enormous energy
by dropping into this ground state, because the exclusion
principle had pushed the individual electrons up to quite
high energies. And starting from this ground state one
has been able to derive all relevant properties of
superconductors, using the successful BCS theory developed by
the American physicists John Bardeen, Leon Cooper and
Robert Schrieffer, who received the Physics Nobel Prize in
1972.
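
The phrase ‘a linear combination of states with all possible different numbers’ can be illustrated with the simplest coherent state, that of a single oscillator mode. To be clear about the assumption: this single-mode state is only an analogy for the far richer BCS ground state, and the mean occupation of 4 is a hypothetical value. The weights follow the standard Poisson form of a coherent state.

```python
import math

def coherent_probabilities(mean_n, n_max):
    """Probability of finding n quanta in a single-mode coherent state."""
    return [math.exp(-mean_n) * mean_n**n / math.factorial(n)
            for n in range(n_max + 1)]

probs = coherent_probabilities(4.0, 60)  # hypothetical mean occupation of 4
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
print(total, mean, variance)
```

The state has a perfectly well-defined average number of quanta, yet a nonzero spread: it is not an eigenstate of the number operator, which is the feature the condensate shares.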


Ground states as coherent states. This situation is similar
to the one we encountered in the previous section about
the harmonic oscillator, looking at the phenomenon of
coherent states. In view of the almost uncomfortably close
analogies between field theory and simple oscillators, it
is imperative to ask about coherent states in field theory.
What do they look like and what would the physics be
like? Multi-particle coherence! What kind of bulk properties
would that correspond to? And what low-energy excitations
would be there? Do we recognize them? What are the
interactions those ‘trivial’ agents could have engaged in, to
give rise to such weird states? Here we enter a domain of
what P.W. Anderson so beautifully characterized as ‘more
is different’. Many identical particles can, because of the
interactions they have, give rise to highly non-trivial, highly
diverse (but also highly non-recognizable) forms of
collective behavior. Just like people, I am tempted to say. We
have already encountered some of them, like quark
confinement and the Higgs mechanism, but in the final part
of the book on structural hierarchies we will discuss many
complex collective manifestations that emerge from the
astonishing simplicity which we have exhibited here. The rich
diversity of the condensed states of matter is the smashing
consequence of having simple basic agents with simple
basic interactions.


Particle spin and statistics 


A quantessential principle with tremendous explanatory
power is Wolfgang Pauli’s exclusion principle, decreeing
that two or more Dirac-type particles (like electrons,
neutrinos, or quarks) cannot occupy the same quantum state.
Not all particles obey the principle, but if a particle does,
it is called a fermion, and it also needs to have half-integral
spin, just like the usual fermions described by a Dirac-like
equation. In this section we discuss a more direct and
therefore more accessible approach to quantum statisti- 
cal properties, based on the topology of the two-particle 
configuration space. The discourse is systematically built 
up, starting from the notions of indistinguishability and ex- 
clusion to describing particle interchange and the spin- 
statistics connection. 


Indistinguishability 


In quantum field theory, the loss of particle identity 
is inevitable 


In quantum field theory the states correspond in general to 
many-particle states. These states are described by one 
field, or wavefunction, and this implies that individual par- 
ticles are no longer distinguishable entities. A severe loss 
of identity in the quantum world. It is a world where only 
family names exist; first names are just not there. 


Figure II.5.19: The Encounter. This magical etching of Maurits
Escher’s was made in 1944. (© 2023 The M.C. Escher Company.)


The fact that multi-particle states are related to a single
field implies an additional property, namely, that the
corresponding particles lose their individuality. Individual
particles of a given type, described by one type of field, become
indistinguishable. It may be that some state of an electron
field describes two electrons, one electron in state A
and one in state B, but you cannot say that particle 1 sits
in A and particle 2 sits in B. They are like identical twins
carrying a family name only but no first name. There is
no ‘John is at home’ and ‘Peter is at school’, even though 
you can say that one is at home and the other at school. 
There is no ‘who is who’ in electron land (what a relief!), 
just strict anonymity and for that matter perfect democracy. 
Particles have a family name only. It may remind you of
extremely strict school outfit rules: identical uniforms,
identical shoes, and identical haircuts, in an attempt to wash
away individual differences. Not my cup of tea. Anyway,
this severe quantum loss of identity affects the counting of
the available number of ‘different states’, and therefore the
statistical properties of ensembles of such quantum
particles. The statistical properties of the particles in turn
are quantessential for understanding their collective
behavior.


We will later return to the basic reason for that loss of in- 
dividuality, being that all multi-particle/antiparticle states of 
a given species correspond to states of a single field de- 
scribing that species, say the electron field or the photon 
field. 


Exclusion 


We have seen that quantization basically implies the study
of wavefunctions on the classical configuration space. So
we want to focus on the special case that is of particular
interest. Imagine that we have two particles that are
‘identical’, meaning that they are indistinguishable. These
two-particle states are described by a single wavefunction
defined on the two-particle configuration space,
depending on the two position coordinates $x_1$ and $x_2$. But
the indistinguishability of the particles implies that certain
configurations which look different at first have to be
identified. If somebody asks us to count the number of different
(distinguishable) states, then we have to identify all
configurations where the positions of identical particles are
interchanged. Again, it’s like a class where we have
identical twins, and we ask how many different class
configurations there are. Assuming that the twins are
indeed indistinguishable by all means, we would have to
count the state where twin A is in the front row and twin
B in the back row, and the configuration where they have
switched places, as one and the same configuration. You
see that the condition of indistinguishability affects the way 
we count the number of possible states, and therefore what 
the statistical weights are that we have to assign for certain 
configurations to occur. 


There is however another important distinction we want to
make right from the start. We may want to implement an
exclusion rule saying that twins are not allowed to sit on the
same chair. They may like each other, but their sympathy
is limited and sitting on the same chair is just out of the
question. A rare occasion where the teacher and the twins
seem to fully agree! Back to identical and indistinguishable
particles: imagine the first particle has coordinate $x_1$ and
the second $x_2$. The quantum state is then described by
a two-particle wavefunction $\psi(x_1, x_2)$ depending on both
coordinates. The question is now what we can say about
the wavefunction if the two particles get interchanged, i.e.
$\psi(x_1, x_2) \to \psi(x_2, x_1)$. Yes, their configuration is identical
in that there is no experiment that can distinguish the two
situations from each other (the usual nightmare for all
twins). But does that imply that the wavefunctions have to
be strictly equal? That’s the question.


Unobservable phases? Taking into account all the lessons
we have been exposed to so far, we can say that the two
wavefunctions can only differ by a subtle attribute that is
not observable, namely the overall phase. It is subtle and
seems completely innocuous, but as we will see it is of
crucial importance. This sounds indeed paradoxical: a
supposedly unobservable phase that manifests itself. Let us
first give the argument the naive and sloppy way, and say
that the wavefunctions differ by a phase factor:


$$\psi(x_2, x_1) = e^{i\alpha}\,\psi(x_1, x_2).$$


We expect that if we interchange them once more we will
get back to the original state, from which it follows that we
have to demand that


$$e^{2i\alpha} = 1,$$


and this constraint has two solutions (modulo $2\pi$): $\alpha = 0$
and $\alpha = \pi$. This in turn implies that there are two
different solutions for the wavefunction under interchange of two
identical particles:


$$\psi(x_2, x_1) = \pm\,\psi(x_1, x_2),$$


implying that the wavefunctions are either symmetric or
antisymmetric under the interchange. And indeed the
particles that obey the symmetric rule are called bosons, and the
antisymmetric guys are called fermions. We see that the
antisymmetric solution implies that the particles cannot sit
in the same spot, because if so the wavefunction would
have to satisfy $\psi(x, x) = -\psi(x, x)$, implying that $\psi(x, x) = 0$!
This ‘unobservable’ phase has huge, quite observable
consequences! This is so because the origin of this type
of phase is topological.
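
The symmetric and antisymmetric options are easy to try out on a toy wavefunction. The sketch below assumes two hypothetical one-particle profiles f and g (Gaussians centred at +1 and −1, chosen only for illustration) and builds the unnormalized combinations f(x₁)g(x₂) ± f(x₂)g(x₁).

```python
import math

def f(x):
    """Hypothetical single-particle profile centred at x = +1."""
    return math.exp(-(x - 1.0) ** 2)

def g(x):
    """Hypothetical single-particle profile centred at x = -1."""
    return math.exp(-(x + 1.0) ** 2)

def two_particle(x1, x2, sign):
    """Unnormalized two-particle wavefunction; sign=+1 boson, sign=-1 fermion."""
    return f(x1) * g(x2) + sign * f(x2) * g(x1)

a, b = 0.4, -0.9  # two arbitrary positions

boson = two_particle(a, b, +1)
boson_swapped = two_particle(b, a, +1)
fermion = two_particle(a, b, -1)
fermion_swapped = two_particle(b, a, -1)
fermion_same_spot = two_particle(a, a, -1)

print(boson, boson_swapped)
print(fermion, fermion_swapped)
print(fermion_same_spot)
```

The interchange leaves the probability density untouched in both cases, since only the sign flips, while the antisymmetric combination vanishes identically when the two positions coincide.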


Because of the indistinguishability requirement, the Hilbert 
space of two-particle states breaks up in two disconnected 
pieces being the even and odd functions. The phase is 
not the overall phase but the phase acquired under the 
interchange operation, and indeed the interchange should 
not change the observable probability distribution, which it 
doesn't. 


Apparently fermionic particles obey an exclusion principle,
and such particles behave physically totally differently from
their bosonic counterparts, which are not subject to this
exclusion principle and may like to hang out in the same spot.
Indeed, they do like to sit on top of each other if it gets
really cold!
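
How indistinguishability and exclusion change the counting of states can be made explicit for two particles distributed over a few single-particle states (the ‘chairs’ of the classroom picture). This is a minimal sketch; the number of chairs is a hypothetical choice.

```python
from itertools import combinations, combinations_with_replacement, product

def count_states(n_levels):
    """Count two-particle states over n_levels single-particle states."""
    levels = range(n_levels)
    # Distinguishable particles: ordered pairs, same state allowed.
    distinguishable = len(list(product(levels, repeat=2)))
    # Identical bosons: unordered pairs, same state allowed.
    bosons = len(list(combinations_with_replacement(levels, 2)))
    # Identical fermions: unordered pairs, same state excluded.
    fermions = len(list(combinations(levels, 2)))
    return distinguishable, bosons, fermions

print(count_states(3))  # → (9, 6, 3)
```

With three chairs there are nine seatings for distinguishable pupils, six for identical twins, and only three once the twins refuse to share a chair. These different counts are what lead to different statistical weights.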


The topology of particle exchange 


Two-particle configuration space. It will turn out that 
the possibility of non-trivial quantum statistics is directly 
linked to the connectivity properties of the configuration 
space of two identical particles and the topology of parti- 
cle exchange. It is therefore worth considering in more de- 
tail what this ‘two-particle configuration space’ really looks 
like. 


We start by taking two coordinates $x_1$ and $x_2$ which take
values in some ordinary space $M \simeq \mathbb{R}^3$, for example.
Instead of choosing $x_1$ and $x_2$ we may also choose as
coordinates the ‘center of mass’ coordinate $X = (x_1 + x_2)/2$
and the ‘relative coordinate’ $x = (x_1 - x_2)/2$. During
interchange we may keep $X$ fixed (the origin, say), for example
by moving the two particles around the center of mass that
is located exactly halfway between them. The interchange
$x_1 \leftrightarrow x_2$ corresponds to a move from $x \to -x$ while
keeping $X$ fixed. So, we are left with studying the ‘x’ space. This
space is again a copy of $M$, but not quite, because in this
space points that are mirror images of each other through
the origin, meaning the points $x$ and $-x$, have to be
identified if the particles are indistinguishable. Furthermore, the
physical interchange in ordinary space corresponds to a
closed loop in this reduced x space.


Figure II.5.20: Shinkichi Tajiri: Meandering paths (1997).
‘Meandering paths, unavoidably returning to an empty shell?’
Looking at this work from a quantum perspective, it depicts the
entangled world-lines of particle pairs, first created and later
annihilated. Indeed, the net effect is a transformation of the
vacuum state. (Source: info@tajiri.nl.)


Three or more dimensions. We can take care of this 
doubling by cutting the space in half,¹³ so we take away
the bottom half of the space, say, all points with z negative 
(z < 0), as we have indicated in Figure II.5.21. This solves 
the problem almost but not quite, because the space we 
are left with has acquired a bottom where strange things 
still happen. Indeed, in the bottom z = 0 plane we still 
have to identify the mirror points. But now this is at least 
something we can do ‘by hand’. 


Connectivity. The connectivity of the space is determined
by studying the classes of possible loops in the space. Let
us first discuss that and then return to the question of
interchanges. In Figure II.5.21 I have drawn two paths. The
first one is the green loop denoted o, which is a loop that
can smoothly be contracted to the red base point; hence
it is the ‘trivial’ loop. This trivial loop means that there is
basically no exchange and therefore the phase of the
two-particle state cannot change, so we conclude that o = +1.
The second, red curve is again a closed loop because the
beginning and endpoint are the same point, but now we
cannot contract the loop. The smooth deformations can
only involve motions of the pair of red points into other
mirror pairs in the bottom plane; if you were to lift them out of
the plane they would no longer be the same point, and you
would cut the loop, which is not so much a smooth
deformation as a killer move. And you cannot bring them together
through the origin, because that point is taken out. So, the
red loop is truly non-contractible and clearly belongs to a
different topological class. We conclude that the reduced
space clearly has some ‘nontrivial’ topology. The question
is to find out what values the phase $\tau$ could take.


13 ‘Cutting the space in half’ is not a typical act that an experimentalist
can perform. The point is that to make the topological argument we can
do this in our head to simplify our analysis without loss of generality.


Figure II.5.21: Topology of two-particle configuration space.
The two-particle configuration space is $\mathbb{R}^3$ but with the bottom
half and the origin removed. And on the $z = 0$ plane a point
and its mirror image through the origin are identified. So there
are two inequivalent types of closed paths possible. The green
loop, which is contractible to a point, belongs to the trivial class;
o = 1. The red path, which is also closed but not contractible,
belongs to the other, non-trivial class.


Interchanges. As we said already, an interchange $x_1 \leftrightarrow x_2$
corresponds to a move from $x \to -x$. Furthermore the
path connecting the two points in x-space is not allowed
to pass through the origin, because then the particles would
meet at the same point and we would like to allow for an
exclusion principle. An admissible move is depicted in the top
graph of Figure II.5.22. In the reduced x-space this
interchange is schematically depicted in the lower graph of the
figure. We do allow the wavefunction to acquire some
constant phase factor $\tau$, and that factor cannot change under
a continuous deformation of the path from $x$ to $-x$ through
x-space. This means that the admissible phases $\tau$ label
the different topological classes of closed paths that are
possible in x-space. We have discussed these classes
before, on page 83 of Chapter I.2, and learned that these
are called homotopy classes.
[Figure panels: x-space (top); reduced x-space (bottom).]


Figure II.5.22: Interchange. Particle interchange, denoted by $\tau$,
in the space of the relative coordinate $x = (x_1 - x_2)/2$ amounts
to moving the point representing the pair from $x$ to $-x$
along some path. In this particular case we have in fact that
$x = x_1$ (red curve) $= -x_2$ (blue curve). The system has then
moved to an indistinguishable two-particle state, which means
that the wavefunction can at most acquire a phase, and we write


$$\tau\,\psi(x) = e^{i\alpha}\,\psi(x).$$


Let us now turn to Figure II.5.23, where we establish a
relation between the interchange process $\tau$ and the reverse
process represented by $\tau^{-1}$. The top-left diagram is again
$\tau$ and the bottom-left diagram represents by definition $\tau^{-1}$.


Now we can do two subsequent smooth deformations of 
the path: in the top-right diagram we go from red to blue 
by just rotating around the blue axis, and in the bottom-right
diagram we go from blue to red again by rotating along the 
dark red trajectory indicated. Note that this deformation 
only involves mirror points (as is evident from the interme- 
diate dark red dashed loop), so the loop remains closed 
and the origin is circumvented as required. 


What we now learn from comparing the red path in the
bottom-right diagram and the path corresponding to $\tau^{-1}$ is
that these two paths can be smoothly deformed into each
other, and therefore belong to the same class. The
conclusion is that we have shown the surprising fact that
$\tau = \tau^{-1}$, in other words that $\tau^2 = 1$, which implies that $\tau$
can only take the values $\tau = \pm 1$.


Figure II.5.23: Topological equivalence. The phase factor $\tau$ is
the same for all interchanges along (closed) paths that can be
smoothly deformed into each other. So $\tau$ labels a class of paths.
In this figure we show that the classes of $\tau$ and $\tau^{-1}$ are actually
the same by a sequence of smooth deformations (rotations). Note
however that the first move from red to blue is only possible if
the dimension of the space is $D \geq 3$. In that case $\tau^2 = 1$ or
$\tau = \pm 1$.


And therefore we have confirmed that quantum theory
allows for only two fundamental types of particles: bosons,
with wavefunctions that are symmetric under particle
interchange, and fermions, with wavefunctions that are
antisymmetric.


But we also have the added restriction that the fermionic
$\tau = -1$ solution requires the exclusion principle,
corresponding to removing the origin of x-space.


Finally, let us make a crucial observation that has been
overlooked for a long time. The first ‘red to blue’
deformation can only be performed if the dimensionality of space
is at least three; it requires $D \geq 3$! The question that
remains is: what is so special about the two-dimensional
case?


Figure II.5.24: The two-dimensional case. We start with the
plane in which we do the interchange; the origin is excluded and
we have to identify $x$ and $-x$. This lets us remove the lower half
of the plane. Then on the bottom boundary of the remaining
top half space we still have to identify mirror points through the
origin. This means the space becomes topologically a cone but
without a tip. And that is topologically equivalent to a cylinder.


The two-dimensional surprise! In two dimensions the
relative ‘x’ space is a plane with the origin taken out, and
with opposite points identified. The paths of the particles
are each other’s mirror image, just as we see in the top
picture of Figure II.5.24. So again we go one step further
and cut away the lower half-plane. Then we don’t have
to make any additional identifications except for points on
the boundary. It is easy to visualize what that means. You
can fold the half lines making up the boundary together,
literally by identifying the mirror points as indicated in the
figure, and what you obtain is a cone! But: a cone without
a tip. It is more like a tipi or an Indian tent with a hole in
the top serving as a chimney to let the smoke out.
Topologically speaking a cone without a tip is not a cone but a
cylinder. And so, after all these topological moves we have
shown that the space $M_2$ becomes an $\mathbb{R}^2$ related to $X$,
times a cylinder, $\mathbb{R} \times S^1$, for $x$. The important conclusion
is that interchanges in the original two-particle space $M_2$
correspond to closed loops on this cylinder. And therefore
the question of a topological characterization of ‘identical’
particle types is then reduced to the question of
equivalence or homotopy classes of closed loops on a
cylinder.


What we see is that the situation in two dimensions is
special indeed, because we can imagine closed paths that
wind one time, two times, or $n$ times around the
cylinder, and these are all inequivalent. So there is an
infinity of classes which can be labeled by the set of (positive
and negative) integers, also referred to as winding numbers
and denoted by $\mathbb{Z}$. And there is even a further property:
you can compose loops by joining the end of the first loop
($\gamma_1$) to the beginning of the second ($\gamma_2$); then you get a
combined loop ($\gamma_3 = \gamma_1 \cdot \gamma_2$). The corresponding classes
of the loops will then add: $n_3 = n_1 + n_2$.
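
The additivity of winding numbers under composition of loops can be verified with a short computation in the relative plane. The sketch rests on a few assumptions made for illustration: a path is a list of points avoiding the origin, an interchange is drawn as a half-turn from x to −x on a circle of radius one, and the winding number on the cylinder is taken to be the total swept angle in units of π. Because x and −x are identified, the end of one half-turn counts as the starting point of the next.

```python
import math

def swept_angle(path):
    """Total signed angle swept around the origin by a polygonal path."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # Signed angle between successive points, in (-pi, pi].
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return total

def half_turn(start_angle, direction=+1, steps=50):
    """An interchange: move from x to -x along half a unit circle."""
    return [(math.cos(start_angle + direction * math.pi * i / steps),
             math.sin(start_angle + direction * math.pi * i / steps))
            for i in range(steps + 1)]

def winding_number(path):
    """Number of interchanges (half-turns) represented by the path."""
    return round(swept_angle(path) / math.pi)

gamma1 = half_turn(0.0)                      # one interchange
gamma2 = half_turn(math.pi)                  # a second one, continuing on
gamma3 = gamma1 + gamma2[1:]                 # composed loop gamma1 . gamma2
undone = gamma1 + half_turn(math.pi, -1)[1:] # interchange, then its reverse

n1, n2, n3 = (winding_number(p) for p in (gamma1, gamma2, gamma3))
print(n1, n2, n3, winding_number(undone))
```

The composed loop has winding number two, and an interchange followed by its reverse has winding number zero. In two dimensions there is no deformation that turns the double interchange into the trivial loop.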


So in two dimensions it is in principle possible to have
particles which satisfy $\tau^n = 1$ for any $n$, meaning that the
phase factor of the two-particle state under interchange
would be $\tau = \exp(2\pi i/n)$. And that is why Frank Wilczek
coined the generic name anyons for such particles,
because they evidently can have any phase.


And indeed, this observation would have the bold
implication that in two dimensions the statistics factor could be
any rational fraction of $2\pi$, $\alpha = 2\pi/N$. By the ribbon
argument which we explain in the next subsection, this would
also imply that the spin value should be $s = 1/N$. How
exotic: a correspondence between fractional spin and
statistics!
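
The family of phases just described is easy to tabulate. This is a small sketch using only complex arithmetic; the range of N shown is an arbitrary choice. It evaluates τ = exp(2πi/N) and confirms that N successive interchanges return the original state.

```python
import cmath
import math

def exchange_phase(n):
    """Phase factor tau = exp(2 pi i / n) picked up under one interchange."""
    return cmath.exp(2j * math.pi / n)

for n in (1, 2, 3, 4):
    tau = exchange_phase(n)
    back_to_start = abs(tau**n - 1) < 1e-12
    print(n, f"{tau.real:+.3f}{tau.imag:+.3f}i", back_to_start)
```

The cases N = 1 and N = 2 reproduce the familiar bosons (τ = +1) and fermions (τ = −1); every larger N gives a phase that is neither, which is only consistent in two dimensions.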


Life in lower dimension is not always less interesting,
apparently! That can’t be right! As a matter of fact, it is true,
and there are states of matter on interfaces or with
planar geometries where such particles exist, for example
as collective excitations in (quasi) two-dimensional media
like the ‘fractional quantum Hall phases’ that are exhibited
by certain conductors at extremely low temperatures, as
we will discuss in Chapter III.3.


Figure II.5.25: Feynman in discussion at the Les Houches
Summer School in 1979. Feynman urged students, including
myself (who took the picture), to try and think of a simpler
explanation of the exclusion principle.


A historical aside. The topological nature of the
particle exchange statistics goes back to work of the
Norwegian physicists Jon Magne Leinaas and Jan Myrheim
from 1977. They applied the very same argument we
employed in Figure II.5.22 and discovered the exceptional
situation in two dimensions. In 1980 I published a paper
where I constructed explicit soliton solutions that exhibited
fractional spin as well as (non-)abelian statistics
properties. It was in the eighties that the extensions of these
ideas took off within my own group, also guided by
important developments in condensed matter theory, such as
the work of Laughlin and Wilczek on the fractional quantum
Hall effect, and in string theory and topological field
theories by Witten. This has led to a quite rich research field,
nowadays called topological order or topological matter, in
which these exotic features are realized and in which I myself
was deeply involved. This research field is expected to have
important applications in scalable and controllable quantum
information processing and storage. And that is a good
reason to explore these topological arguments a little
further. It is an attractive type of physics, because it involves
global analysis, which appeals to conceptual imagination
rather than calculus-type skills. It’s fun when basic (or
fancy) physics meets basic (or fancy) mathematics; it
really looks like these two fields of science are ‘condemned’ to
each other. A marriage forced by nature on the one hand
and a mariage de raison, as the French say, on the other,
that should be a happy one.


The spin-statistics connection 


We have in previous sections mentioned the remarkable
fact that particles having half-integer spin happen to be
fermions, while the integer-spin particles are always bosons.
This spin-statistics connection between interchange
properties and spin was not at all obvious from the start,
and it only became clear once Dirac wrote down his famous
equation for the electron and its anti-particle, the positron,
that both properties were a necessary consequence of the
brilliant interpretation of that equation given by Dirac.


But now that we understand the topological argument for the
interchange factor from carefully looking at the two- (or
multi-) particle configuration space, as we did in the previous
section, one wonders whether there is not a more direct
argument for the connection of this factor to the spin.
There is, as we will show next, and it again turns out to
illuminate the possibility of fractional spin for those
aforementioned anyonic excitations.


Figure II.5.26: Ribbon diagrams. The ribbon diagram of the
creation and subsequent annihilation of a particle anti-particle
pair, where the arrow indicates the direction of the charge
current (left). The effect of a $2\pi$ rotation of a particle on the
state is equivalent to the effect of a rotation of an antiparticle
(right); the net effect is a change of the vacuum state by a
phase factor $R(2\pi)$.


Ribbons. The trick is basically to realize that a particle
with spin should be represented by a ribbon instead of a
line. Let us imagine creating a particle anti-particle pair
and subsequently annihilating it; then we get a diagram like
in Figure II.5.26. We can of course also rotate the particle,
say over an angle of $2\pi$, before annihilating the pair; this
corresponds to a full twist of the ribbon. What is demonstrated
in the diagram on the right is that we can move
the twist smoothly from the particle line to the antiparticle
line, which shows that their spins should be equal. The
rotation will change the phase of the two-particle wavefunction
by an angle $\alpha = 2\pi s$, where $s$ is the spin of the
(anti-)particle.


To demonstrate the equivalence of a rotation by $2\pi$ to an
interchange we go to the next Figure II.5.27. There we first
create two pairs, then we cut the two identical particle ribbons
and reconnect them to arrive at the diagram in the
middle, where the ribbons show that we interchanged the
particles. In other words, we have applied the interchange
operator $\tau$ to the wavefunction describing the middle two
particles. As indicated in the diagram on the right, the complete
exchange diagram can be smoothly deformed into
the diagram where one of the particles is rotated over $2\pi$.
This you can actually verify by taking a ribbon and literally
repeating the described actions. What this says is that the
wavefunction of the state is acted on by the interchange
operator $\tau$, shifting the phase of the state by an angle $\alpha$,
but this phase should be equal to $2\pi s$ according to the
topological equivalence of the two diagrams.


Figure II.5.27: Spin-statistics connection. Two pairs are created
and annihilated, corresponding to a trivial effect on the vacuum
state. The pictures on the right demonstrate the topological
equivalence of the interchange of two identical particles with a
rotation on one of them. This implies that
$\tau|\psi\rangle = R(2\pi)|\psi\rangle = \pm|\psi\rangle$, where the plus sign
holds for bosons and the minus sign for fermions.


So this simple argument nicely shows the topological nature
of the statistics factor and of the spin-statistics connection.
And who would have expected that you could give
a ‘ham-handed’ experimental ‘proof’ of the spin-statistics
connection just using two identical belts!


Marbles (distinguishable): 9 states

#  | A  | B  | C
1  | 12 |    |
2  | 1  | 2  |
3  | 1  |    | 2
4  |    | 12 |
5  |    | 1  | 2
6  |    |    | 12
7  | 2  | 1  |
8  | 2  |    | 1
9  |    | 2  | 1

Bosons (indistinguishable): 6 states

#  | A  | B  | C
1  | xx |    |
2  | x  | x  |
3  | x  |    | x
4  |    | xx |
5  |    | x  | x
6  |    |    | xx

Fermions (indistinguishable, exclusion): 3 states

#  | A  | B  | C
1  | x  | x  |
2  | x  |    | x
3  |    | x  | x

Table II.5.1: State counting. Counting states for 2 identical particles that can occupy one of three states. The tables list the possible
2-particle configurations for classical particles, bosons and fermions.


Statistics: state counting 


We return to the standard setting of more conventional 
quantum theory and illustrate how indistinguishability, ex- 
clusion, and interchange properties do affect the statistical 
properties of ensembles of particles. This becomes clear 
if one starts counting the available ‘distinct’ states. 


Let us illustrate this state counting by considering a simple
example of two identical particles labeled 1 and 2 that can
each be in one of three states A, B and C. In Table II.5.1
we have listed the distinct configurations
for classical particles (‘marbles’), which are supposed to
be distinguishable, for quantum particles that are
indistinguishable but do not obey the exclusion principle (bosons),
and for quantum particles that do obey the exclusion
principle (fermions). Because the counting of available states
is different, allowing for 9, 6 and 3 states respectively, the
probabilities are directly affected. For example, assuming
equal probabilities for each allowed state, one may ask a
question like: ‘What is the probability $p$ that the two
particles sit in the same state?’ Clearly for the marbles the
answer is $p = 1/3$, for the bosons $p = 1/2$, while for the
fermions we have $p = 0$.
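
The counting is small enough to do by brute force. The following is a sketch under the assumptions stated in the text (two particles, three states A, B and C, all allowed configurations equally likely); the function name `count` is ours and purely illustrative.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product

STATES = "ABC"

def count(configurations):
    """Return (number of configurations, probability both share a state)."""
    configs = list(configurations)
    same = sum(1 for a, b in configs if a == b)
    return len(configs), Fraction(same, len(configs))

# Marbles: ordered pairs, particle 1 and particle 2 are distinguishable.
marbles = count(product(STATES, repeat=2))
# Bosons: unordered pairs, double occupation allowed.
bosons = count(combinations_with_replacement(STATES, 2))
# Fermions: unordered pairs, double occupation excluded.
fermions = count(combinations(STATES, 2))

print(marbles, bosons, fermions)
```

The three results reproduce the 9, 6 and 3 states and the probabilities 1/3, 1/2 and 0 quoted above; changing `STATES` to a longer string shows how quickly the three countings drift apart.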


For the case at hand we can define the two-particle state
$\Psi_{ij}(1,2) = \psi_i(1)\,\psi_j(2)$ as a product of the states of the
individual particles, where $i$ and $j$ could be A, B or C. We
can thus think of $\Psi_{ij}$ as a $3 \times 3$ matrix. For the classical
states there indeed are $3 \times 3 = 9$ entries. For the
bosons we have to require that the state be symmetric,
$\Psi(1,2) = \Psi(2,1)$, corresponding to a symmetric
matrix, which indeed has 6 independent entries, while for
fermions we have to require the state to be antisymmetric,
$\Psi(1,2) = -\Psi(2,1)$, corresponding to an antisymmetric
matrix having only 3 independent entries because the
diagonal ones have to be zero. Indeed, a state vector
$\Psi$ where the fermions would be in the same state would
mean $\Psi_{ii} = -\Psi_{ii}$, implying that it has to vanish, saying
nothing less than that there is no such state.


Figure II.5.28: Bosons and fermions at $T = 0$. The distributions
for the two particle types at $T = 0$. The energy levels are
along the vertical axis, and the occupation number is indicated
by the number of balls.


These basic statistical properties of particles have pro- 
found physical consequences if we study many particle 
systems and their collective behavior. For a system in ther- 
mal equilibrium with its environment, there will be a cer- 
tain probability of a certain energy level to be occupied or 
not, which means that in a large system of many particles 
you get a distribution which tells you how many particles 
there will be on average at a certain energy level. Now 
dependent on the type of particle, these distributions are 
different, especially if one goes to low temperatures and 
low energies where the quantum behavior becomes mani- 
fest. 


What do we roughly expect to happen? Let us start with
the zero-temperature case, shown in Figure II.5.28.
Indeed for the bosons we expect that they all congregate,
or better condense, in the ground state. This is
in contrast with the fermions, where we expect that for $N$
fermions the lowest $N$ states would be filled, while the
higher states would be empty. The highest filled level is
called the Fermi level, corresponding to the Fermi energy.
Now if we heat the system up, particles may get excited
to higher levels and fall back again, until a certain
temperature-dependent distribution over states is reached.
In Figure II.5.29 we have indicated what that looks like.
Clearly for the fermions, where all lower levels are filled
already, the thermal excitations can only take place near
the Fermi level. Fermionic excitations in fact also create a
hole: near the Fermi level one necessarily creates
particle-hole pairs.


Figure II.5.29: Bosons and fermions at $T > 0$. The panels are
labeled ‘Bosons’ and ‘Fermions’. Axes are the same as in the
previous figure.


The functional form of the three distributions can be
determined exactly, and they are depicted in Figure II.5.30 for two
different temperatures. They have the following functional
form:

$$n(E) = \frac{1}{e^{(E-\mu)/kT} - m},$$

where for $m = 0$ we have the classical Maxwell-Boltzmann
distribution corresponding to the blue curves, while for
$m = +1$ we have the Bose-Einstein distribution corresponding
to the red curves, and finally for $m = -1$ the Fermi-Dirac
distribution corresponding to the dark red curves. You may
think of these distributions as functions of the particle state
energy, parametrized by the temperature and the chemical
potential (Fermi energy) denoted by $\mu$. Let us make some
observations concerning these distributions.

i. Note that the axes in Figure II.5.30 are labeled orthogonally
to those in Figures II.5.28 and II.5.29.

ii. Observe that for high enough energy all the distributions 
look the same for all temperatures, which is the statement 
that all particles approximately show the classical behav- 
ior. The quantum distinctions get washed away by the vio- 
lent thermal fluctuations. 

iii. Drastic differences, however, show up for low values of the
relevant energy scale $E - \mu$. Whereas the fermion occupation
number necessarily is smaller than or equal to one, the
boson occupation number increases rapidly if the energy
goes to zero. In fact, if we lower the temperature to absolute
zero, the fermion distribution function becomes a step
function, indicating that up to the Fermi level all states are
occupied (here $\mu$ is the Fermi level, or the surface of the
Dirac sea). For bosons we see that all particles will pile up
in the same lowest energy state.

iv. There is actually a real phase transition, a so-called
Bose condensation, in which all particles
sit in the same quantum state. This is in fact an example
of a special macroscopic quantum state that stands
out because of its so-called quantum coherence. Such
states exhibit truly spectacular properties, such as
superfluidity, meaning that the system forms a quantum fluid
with zero viscosity. In certain metals this can lead to the
phenomenon of superconductivity, where the electric
resistance vanishes at very low temperatures. We will return
to these subjects in later chapters.
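
The observations above can be checked with a few numbers. This is a minimal sketch, assuming the three distributions share the form $n(E) = 1/(e^{(E-\mu)/kT} - m)$ with $m = 0, +1, -1$ for marbles, bosons and fermions; the function name `occupation` and the choice of units with $kT = 1$ are ours.

```python
import math

def occupation(energy, mu, kT, m):
    """n(E) = 1 / (exp((E - mu)/kT) - m); m = 0, +1, -1 as in the text."""
    return 1.0 / (math.exp((energy - mu) / kT) - m)

for x in (0.1, 1.0, 5.0):  # x = E - mu, in units where kT = 1
    mb = occupation(x, 0.0, 1.0, 0)    # Maxwell-Boltzmann
    be = occupation(x, 0.0, 1.0, +1)   # Bose-Einstein
    fd = occupation(x, 0.0, 1.0, -1)   # Fermi-Dirac
    print(f"E - mu = {x}: MB {mb:.4f}  BE {be:.4f}  FD {fd:.4f}")

# Near absolute zero the fermion distribution approaches a step function.
print(occupation(-1.0, 0.0, 0.01, -1), occupation(+1.0, 0.0, 0.01, -1))
```

For large $E - \mu$ the three columns nearly coincide, while for small $E - \mu$ the boson occupation grows large and the fermion occupation stays below one.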


Figure II.5.30: Particle distributions. The distributions for three
particle types, giving the occupation number $n(E)$ of a state
at energy $E - \mu$ for two temperatures (panels: high temperature,
low temperature). The red curves are for bosons, the blue ones
for ‘marbles’ and the dark red ones for fermions. This figure is
rotated 90° clockwise with respect to the previous figures.


You may wonder how such peculiar rules like exclusion
and indistinguishability can be implemented in a
mathematically consistent way. It turns out that to do multi-particle
(often called many-body) quantum physics, you basically
have to use the formalism of quantum fields. In this
formalism we have operators that can create or annihilate
(anti)particles in any admissible energy-momentum state.
And one finds that the different types of statistics are a direct
consequence of the basic relations between these particle
creation and annihilation operators. For bosons we have that
the creation and annihilation operators satisfy commutation
relations, meaning that


$$[a_k^\dagger, a_{k'}^\dagger] = 0, \quad [a_k, a_{k'}] = 0 \quad \text{and} \quad [a_k, a_{k'}^\dagger] = \delta_{kk'},$$


where the commutator of two operators A and B is defined 
as [A, B] = AB — BA. For fermions these are replaced by 
anticommutators where the anti-commutator is defined as 
{A, B} = AB+BA. If two creation operators anti-commute 
one has in particular that 


$$\{c_k^\dagger, c_k^\dagger\} = 0,$$


meaning that putting two particles in the same state gives
zero; it just can’t be done. This necessary choice of
commutation or anti-commutation relations for the basic
operators is forced upon you by the requirement of a
physically consistent interpretation of the theory. That choice
accounts for all characteristic differences between bosons
and fermions, in particular the appearance of completely
symmetric or antisymmetric wavefunctions.
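
For a single fermionic mode the anticommutation relations can be made completely explicit. The sketch below assumes a two-dimensional state space with basis (empty, occupied) and represents the creation and annihilation operators as $2 \times 2$ matrices; the helper names `matmul`, `add` and `anticommutator` are illustrative only.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    """{A, B} = AB + BA."""
    return add(matmul(a, b), matmul(b, a))

# One fermionic mode in the basis (empty, occupied).
c_dag = [[0, 0], [1, 0]]   # creates a particle: empty -> occupied
c = [[0, 1], [0, 0]]       # annihilates it: occupied -> empty

print(anticommutator(c_dag, c_dag))  # [[0, 0], [0, 0]]: no double occupation
print(anticommutator(c, c_dag))      # [[1, 0], [0, 1]]: the identity
```

The first line is the exclusion principle in matrix form: applying the creation operator twice gives zero, whatever state you start from.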


More for less: two-dimensional exotics I 


It is like the telephone game in kindergarten. The
children are sitting in a circle and you whisper a
sentence in the first kid’s ear; then she has to pass
it on until it has gone all the way around. The last
person speaks out loud the sentence he
received. Then they compare the sentences, and
share their disbelief that such distortions are
possible. That is presumably how lies emerge. This
metamorphosis amounts to a non-contractible loop
in language space, a nontrivial linguistic holonomy.


The Aharonov-Bohm phase. We recall the discussion
we had in Chapter II.3 on the Aharonov-Bohm phase shift.
If you carry a charge $q$ along a loop $\gamma$ around a localized
flux, then the loop integral of $\mathbf{A}$ along $\gamma$ yields the magnetic
flux through (any) two-dimensional surface that is bounded
by the loop. This implies that the loop operator $W_\gamma$
basically measures the magnetic flux:

$$W_\gamma = \exp\left(iq\oint_\gamma \mathbf{A}\cdot d\mathbf{l}\right) = \exp(iq\Phi).$$

We considered a well-defined narrow magnetic flux tube
piercing through the surface as in Figure II.5.31. If we
adiabatically move a charge around the flux $\Phi$, the state will
change according to

$$|q, \Phi\rangle \rightarrow W_\gamma(q, \Phi)\,|q, \Phi\rangle,$$

where the phase factor $W$ equals

$$W_\gamma(q, \Phi) = e^{iq\Phi}.$$


Figure II.5.31: The Aharonov-Bohm phase factor. If we carry
a charge $q$ along a loop $\gamma$ around a localized magnetic flux $\Phi$,
then the state will acquire a phase factor $W_\gamma = \exp(iq\Phi)$.


An important property of this phase is that it is not only 
gauge invariant but also topologically invariant, meaning 
that you can deform the loop any way you want as long as 
you don’t cross the flux. 


Anyons as flux-charge composites. Let us return to our
discussion about two-dimensional particles and their spin
and statistics properties. Let us look once more at
Figure II.3.33, but in a different way. I now think of the charge
and flux as one composite object. The situation is like
in Figure II.5.32, where we look from far away and do not
worry about the (internal) structure of the pair. The
interpretation of the figure is then that we rotate the composite
over an angle of $2\pi$, and we see that the state of this
funny particle has changed by $W(q, \Phi)$. This means that
our conclusion has to be that the composite must carry
some spin $s$, which causes the non-trivial phase factor of
the state under rotation by $2\pi$. By definition, for a particle
carrying spin $s$ the corresponding factor is given by

$$e^{2\pi i s} = e^{iq\Phi} \quad\Rightarrow\quad s = \frac{q\Phi}{2\pi}.$$


Figure II.5.32: Flux-charge composite. We think of the
charge-flux pair as a composite particle. Then the electromagnetic
phase factor can be interpreted as due to a ‘(fractional) spin’
$s$ of the composite.


For example, if in the superconducting layer a single electron
would bind with a minimal flux ($\Phi_0 = \pi/e$), we would
have $s = e\Phi_0/2\pi = 1/2$; this would be a spin-half
composite particle!
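
The relation between charge, flux and spin is simple enough to tabulate. Here is a small sketch, assuming the relation $s = q\Phi/2\pi$ with the charge measured in units of $e$ and the flux in units of the minimal flux $\Phi_0 = \pi/e$; the function names `composite_spin` and `rotation_phase` are hypothetical.

```python
import cmath
import math
from fractions import Fraction

def composite_spin(q_over_e, flux_over_phi0):
    """Spin s = q*Phi/(2*pi), with q in units of e and Phi in units of pi/e."""
    # q*Phi = (q/e) * (Phi/Phi0) * pi, so s = (q/e) * (Phi/Phi0) / 2.
    return Fraction(q_over_e) * Fraction(flux_over_phi0) / 2

def rotation_phase(spin):
    """Phase factor exp(2*pi*i*s) picked up under a rotation over 2*pi."""
    return cmath.exp(2j * math.pi * float(spin))

s = composite_spin(1, 1)        # one electron bound to one minimal flux
print(s, rotation_phase(s))     # spin 1/2, phase factor close to -1
```

Other combinations of charge and flux, such as `composite_spin(1, Fraction(1, 2))`, give fractional values of the spin and a rotation phase that is neither $+1$ nor $-1$.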


The spin-statistics connection for composites. We argued
that the composites can have fractional spins depending
on which fluxes and charges are allowed. But is
it also true that they would exhibit the corresponding
exchange properties? Can we establish a spin-statistics
connection using the ribbon diagrams of Figure II.5.27? Let us
start with the phase factor of two composites as in Figure
II.5.33. The combined state after a full rotation would obtain
a phase that is twice that of $W(q, \Phi)$, because the charge
$q_1$ would encircle the flux $\Phi_2$ and at the same time $q_2$
the flux $\Phi_1$, giving us $2q\Phi$ as the fluxes and charges are
equal. A full rotation corresponds to two successive
interchanges, so we have to take the square root, as we only
want to do the interchange, and we do indeed get the same
result as the spin factor.* This way we have established
the exotic spin and statistics properties that are possible in
two dimensions.


* The possible extra minus sign from taking the square root cannot
be resolved at this level of the analysis. Note however that it allows for
implementing that the constituents would be a fermion to start with.


Figure II.5.33: Interchange statistics of composites. For the
composite particle it follows that the spin-statistics connection
holds.


A historical aside. These particles are called anyons,
a name coined by Frank Wilczek, because they can acquire
any phase upon rotation or interchange. The so-called
quantum Hall systems in which they appear were discovered
by the German physicist Klaus von Klitzing, and the fractional
version by Störmer and Tsui. The theory of this phenomenon,
involving fractionally charged anyons with
fractional spin along the lines we just pointed out, was
developed by the Americans Robert Laughlin, who shared
the Nobel Prize with Störmer and Tsui in 1998, and Frank
Wilczek, who also received a Nobel Prize, for the
theory of the strong interactions.


There are now many proposals for phases of condensed
matter that feature these local anyonic excitations. Such
phases share a property called topological order. It was
the Russian theoretical physicist Alexei Kitaev who pointed
out that such anyons would be ideally suited to build
quantum information devices with, because anyonic qubits
are intrinsically fault tolerant. This highly desirable property
derives from the topological nature of the quantum
phases, which ensures that these cannot be destroyed by
local interactions, so that such error-generating effects would
be exponentially suppressed. One may manipulate the
phases of multi-anyonic, multi-qubit states by just moving
the anyons around each other, or as it is called, by ‘braiding’
them. Because of their topological nature, computations
with anyons would correspond to particular braids or knots
of their world lines. And computation would boil down to
some kind of quantum knitting!

Chapter II.6


Symmetries and their breaking 


Symmetry, as wide or as narrow as you may define 
it, is one idea by which man through the ages has 
tried to comprehend and create order, beauty and 
perfection. 

Hermann Weyl 


Symmetries play and have played a crucial role in the
development of the modern physical sciences. It is a rich
subject, and its manifestations are quite diverse and
display remarkable analytical and aesthetic aspects. Central
to this topic are the mathematical notions of a Lie group
and a Lie algebra. In the quantum context these symmetries
are implemented by certain sets of operators (observables)
that act on the Hilbert space of the system. We have
encountered them already, as they arise naturally at many
levels in the framework of quantum theory. The connections
between formal mathematical and physical concepts
are summarized in the table on page 447, and I recommend
that you regularly consult the table while reading this
chapter.


In this chapter we have split the applications between the
well-known ‘ordinary’, rigid, or global symmetries and the
so-called gauge or hidden or local symmetries. The former
are like the familiar translations or rotations, or isospin
transformations, while the latter refer to the internal
symmetries that are tied in with the fundamental interactions.
Electrodynamics is a simple example of a gauge theory,
and we have already discussed its gauge symmetry
in Chapter I.1. Gauge symmetries are especially powerful
because they are restrictive, in the sense that they dictate
the way particles can interact in a consistent way. The
dynamical equations underlying the Standard Model are
pretty much an expression of this principle of local gauge
invariance. The mathematical concepts are those of
differential geometry and the theory of fiber bundles, as we
pointed out in the section on the ‘Physics of geometry’ of
Chapter I.2.


After the discussion of symmetries themselves, we move 
on to talk about breaking the symmetries. Symmetry break- 
ing is another powerful concept that has found a rich va- 
riety of applications in fundamental physics on all scales, 
from say the cosmos all the way down to the phenomena of 
ferromagnetism in condensed matter or the Higgs mecha- 
nism in particle physics. 

Symmetry breaking offers a hierarchical perspective
on the increasing diversity and complexity we observe
in nature, as a pattern resulting from a sequence
of symmetry-breaking transitions. We will discuss
examples of the breaking of global as well as local
symmetries.

Symmetry and its breaking are deep and delightful subjects
that teach us about the mathematical intricacies of
fundamental interactions and their structural beauty.

Let me start this chapter by stepping back and revisiting
some statements I have made along the winding road we
have taken so far, and looking at them again from the point
of view of symmetry. Symmetry pops up everywhere, and
that indicates that there are many entries into this
quantessential subject. Whereas symmetry leads to unity,
similarity, and degeneracy, breaking symmetries does the
opposite: it is a mechanism explaining how symmetry can get
lost. The mechanism is quite generic and it is therefore
important to understand its systemic signatures.


Nature started from a highly degenerate situation at a very 
high temperature (energy) and then created (evolved) di- 
versity by going through a series of symmetry breaking 
transitions that took place when the ambient energy or 
temperature lowered. In an expanding universe like ours 
the loss of symmetry is as natural as it is inescapable. 


By changing a circle into an ellipse and then into an arbitrary
closed curve, one goes from a symmetry of continuous
rotations in the plane, to two mirror symmetries, to no
symmetry at all. It is a sequence of ever more symmetries
being broken. Note however that from an information point of
view, the information content increases with decreasing (or
the breaking of) symmetry. Indeed, you move from a curvature
along the closed curve that is constant, to a curvature
that is a periodic function, to one that is a random function,
and the amount of data you need to describe them increases.


Too much symmetry is boring because it is extremely re- 
dundant and predictable, but the same holds for too much 
randomness because of an extreme lack of structure. Ex- 
citement and beauty apparently reside halfway in between, 
and that is maybe why nature has chosen a path of break- 
ing more and more symmetries. At present we encounter 
remnants of lost symmetries like subtle and hidden mem- 
ories. But that is what makes nature so interesting. Life 
as an ‘avenue of broken symmetries’ so to speak. It allows 
science to gain a deeper and more unified understanding 
of the hidden patterns underlying reality. 


Symmetries of what? 


The symmetries that are important in physics, are 
not the symmetries of things but the symmetries of 
equations. 

Steven Weinberg 


We think of a group of symmetries as a set of operations 
or transformations that leave something invariant. This can 
be an object like a triangle or a sphere, and we speak of 
the ‘symmetries of objects’, and this is certainly its most 
familiar manifestation. We may also think of the symme- 
tries of spaces, these are transformations on the space, 
meaning transformations of the coordinates in such a way 
that the properties of that space do not change. For example,
flat space $\mathbb{R}^3$ has a huge group of symmetries: we
can translate it over an arbitrary distance in any direction, 
we can rotate it around any axis through any point over 
any angle, and we can scale it by any amount around any 
point. With an infinite flat space you wouldn't see the dif- 
ference, it is invariant under all those transformations and 
combinations of them. And besides that it has also dis- 
crete mirror symmetries, a transformation called parity. It 
makes you wonder whether it is this incredible overkill of 
symmetry that makes flat space so boring. 


Yet another, and in physics crucial, application is to study
not so much the symmetries of things, but rather the
symmetries of equations, which means again that we make a
transformation on the dynamical variables that leaves the
(system of) equations invariant.


Realizations of symmetry in nature. People I trust have
told me that the Inuit have 32 words for snow, and that
presumably is because they know a lot more about it than I
do. By living in the snow for centuries they have learned to
differentiate and appreciate an immense diversity in
something that I just call ‘snow’. Something similar has
happened with the notion of symmetry in physics and its mirror
images in mathematics.


With all these different approaches comes a correspond- 
ingly rich terminology referring to what we are precisely 
talking about. One speaks of discrete versus continuous, 
finite versus infinite, space-time versus internal, local ver- 
sus global, broken versus unbroken, approximate versus 
exact, normal versus super, classical versus quantum
symmetries. This summary suffices to justify a chapter on this
topic, a chapter in which I will guide you through some of
this extensive jargon in a way that emphasizes the basic
concepts.


Groups, algebras and their representations. The framework
for the following discussions on symmetry is summarized
in the table on page 447, and it shows that in the
class of continuous symmetries the mathematics is mostly
that of Lie groups and algebras. These are quite abstract,
mathematically precisely defined objects themselves, but
the beauty is that they come with an important part denoted
as representation theory. Physicists perceive the notion
of symmetry mostly through the particular representations
that are manifest in nature. Let me recall the observables
$\{X, Y, Z\}$, the Pauli matrices, and the fact that their
commutation relations form the non-commutative Lie algebra
denoted as $su(2)$.¹ It is called the ‘defining’ representation
of this algebra because it is in the form of $2 \times 2$ hermitian
matrices, working on a two-dimensional complex vector
space, the state space of a single qubit. But exactly
the same algebra, meaning an identical set of commutation
relations, is obeyed by the angular momentum operators
$\{L_x, L_y, L_z\}$. That is a different representation of the
same algebra, in terms of differential operators working on
a space of functions, the Hilbert space; quite different
from $2 \times 2$ matrices but satisfying the same algebra. If
we furthermore restrict to states of a given angular momentum
$l$ (think of the hydrogen atom), then these form
a $(2l + 1)$-dimensional vector space, and the rotations are
then generated by a specific set of three $(2l + 1) \times (2l + 1)$
hermitian matrices. And all these sets form inequivalent
representations of the same algebra, labeled by the quantum
number $l$. We will be somewhat cavalier about making
distinctions between the abstract notions of an algebra
or group and their representations. In physics we mostly
work within the context of particular, often unitary,
representations. You may think of representation theory as the
physical contextualization of abstract group theory.


¹ To be precise, it is one-half times the Pauli matrices that satisfy
the su(2) algebra. Commutation relations are nonlinear, so the scale is
exactly fixed. This factor one-half turns out to be important.
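
That two very different-looking sets of matrices obey the same algebra can be verified directly. The sketch below assumes units in which $\hbar = 1$ and uses the standard $3 \times 3$ matrices for $l = 1$; these explicit matrices and the helper names (`matmul`, `commutator`, `scale`, `close`) are our own illustrative choices, not taken from the text.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def scale(factor, a):
    """Multiply every entry of a matrix by a number."""
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices up to rounding errors."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

# Pauli matrices X, Y, Z; one-half times these generate su(2).
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
Sx, Sy, Sz = scale(0.5, X), scale(0.5, Y), scale(0.5, Z)

# The same algebra on the 3-dimensional (l = 1) space.
h = 2 ** -0.5
Lx = [[0, h, 0], [h, 0, h], [0, h, 0]]
Ly = [[0, -1j * h, 0], [1j * h, 0, -1j * h], [0, 1j * h, 0]]
Lz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

print(close(commutator(Sx, Sy), scale(1j, Sz)))  # [Sx, Sy] = i Sz
print(close(commutator(Lx, Ly), scale(1j, Lz)))  # [Lx, Ly] = i Lz
```

Dropping the factor one-half, `commutator(X, Y)` equals $2iZ$ rather than $iZ$, which illustrates why the scale of the generators is fixed by the commutation relations.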


Symmetries and conserved quantities 


Heisenberg equations. I choose a route that starts with
symmetries of a Hamiltonian (operator), leading from there
to the notion of conserved quantities, and from there to
frameworks for labeling the energy eigenstates of that
Hamiltonian. Let me start from the basic Heisenberg
equations, which apply to quantum systems on all levels:

$$i\hbar\,\frac{dA}{dt} = [A, H]. \qquad \text{(II.6.1)}$$

Remember that in this formulation the dynamical variables
or observables are time dependent, and in that sense the
Heisenberg approach is closer to the classical one, because
it is formulated in terms of the observable quantities
only.² This is in contrast with the Schrödinger equation, which
describes the time evolution of quantum states, and those
are not directly observable.


² Note the similarity between the Heisenberg equations and the
Poisson equations discussed in the section on classical mechanics of
Chapter I.1.


Figure II.6.1: The quantessence of symmetry. If an observable
$Q$ commutes with the Hamiltonian, then it is conserved in time,
and generates a symmetry of the system.


Symmetries and conservation laws. The equation says
that the time evolution of the system is generated by the
Hamiltonian $H$. In particular, an infinitesimal change in
time, corresponding to acting with $i\hbar\,d/dt$ on the variable,
is equal to taking the commutator of that variable with the
Hamiltonian. Consider now an observable $Q_i$ which
commutes with the Hamiltonian or energy operator, so:

$$[Q_i, H] = 0 \quad\Rightarrow\quad \frac{dQ_i}{dt} = 0.$$


The equation teaches us that observables that have a
vanishing commutator with the Hamiltonian do not change in
time. They are constants of the motion and are conserved
in time. It means that if the dynamics of the system follows
the Heisenberg evolution equations, and we start with
a state corresponding to a certain (eigen)value $q_i$ for the
observable $Q_i$, then the evolution will take place in a
subspace of the Hilbert space labeled by that eigenvalue, and
by some eigenvalue $E$ of the Hamiltonian as well, because
the energy operator $H$ is (by definition) a conserved
quantity. Everybody commutes with themselves after all.
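
The step from a vanishing commutator to a label on the states can be spelled out in one line. As a sketch, assume $|q_i\rangle$ is an eigenstate of $Q_i$ with eigenvalue $q_i$ (the ket notation is an assumption made here for illustration) and that $[Q_i, H] = 0$:

```latex
Q_i \bigl( H\,|q_i\rangle \bigr)
  = H\,Q_i\,|q_i\rangle
  = q_i \bigl( H\,|q_i\rangle \bigr)
```

So acting with $H$ never leads out of the subspace with eigenvalue $q_i$, and since the time evolution is generated by $H$, the label $q_i$ stays attached to the state.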


This reasoning leads to an interesting picture: we have a 
system characterized by a set of basic variables (think of 
position and momentum) and a huge set of derived ob- 
servables (like energy or angular momentum), and these 


observables form a closed operator algebra under com- 
mutation. In the Math Excursion on vectors and matrices 
on page 632 of Volume III we explain that these algebras 
of observables that close under commutation are in mathe- 
matics referred to as Lie algebras. We present an overview 
of the relation between mathematical and physical aspects 
of symmetry in the table on page 447. 


Lie algebra of observables. What we say is that such 
a Lie algebra is a rather abstract thing, but it has repre- 
sentations in the form of matrices or differential operators. 
This we saw for example with the algebra of the canonical 
variables X and P, which reads: 


$$ [X, P] = i\hbar \quad\Rightarrow\quad X \to x \quad\text{and}\quad P \to -i\hbar\,\frac{d}{dx} , $$

and therefore has a representation where X is represented 
by the ordinary number variable x (like it appears as argument
of the wave function). Acting with X on a wavefunction
ψ(x) means multiplying that wavefunction by x. P is
represented by the differential operator as indicated in the 
equation above. It is the infinitesimal displacement oper- 
ator. This was worked out in the section on position and 
momentum operators on page 387. 
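
To see this representation at work, here is a minimal sketch that lets X and P act on polynomial wavefunctions and checks that the commutator returns iħ times the input. The coefficient-list encoding, the function names and the choice ħ = 1 are assumptions made for the illustration only.

```python
# Polynomials are coefficient lists: [c0, c1, c2, ...] means c0 + c1*x + c2*x**2 + ...
HBAR = 1.0  # assumption: units in which hbar = 1


def apply_X(psi):
    """X multiplies by x, so every coefficient moves up one power."""
    return [0j] + [complex(c) for c in psi]


def apply_P(psi):
    """P acts as -i*hbar*d/dx on the polynomial."""
    deriv = [n * complex(c) for n, c in enumerate(psi)][1:]
    return [-1j * HBAR * c for c in deriv] or [0j]


def commutator_XP(psi):
    """Return the coefficients of (XP - PX) acting on psi."""
    xp = apply_X(apply_P(psi))
    px = apply_P(apply_X(psi))
    size = max(len(xp), len(px))
    xp += [0j] * (size - len(xp))
    px += [0j] * (size - len(px))
    return [a - b for a, b in zip(xp, px)]


psi = [2.0, -1.0, 0.5, 3.0]            # 2 - x + 0.5 x^2 + 3 x^3
result = commutator_XP(psi)
expected = [1j * HBAR * c for c in psi]  # i*hbar times the original polynomial
print(result)
print(expected)
```

The two printed lists should agree term by term, whatever polynomial you start from.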


Translation invariance and momentum conservation. 
Let us explore this a little further along the lines of en- 
ergy conservation for the simple mechanical system that 
we discussed in the section on Newtonian mechanics in 
Chapter I.1. If we consider the energy of a particle then
that usually consists of a kinetic part P²/2m and a potential
part U(X). Suppose that we make the additional assumption
that the potential energy is constant and does
not depend on X, then the canonical commutation relations
above imply that [P, H] = 0 and hence the momentum
is conserved. In the classical argument one would
normally say that the force F(x) = −dU/dx = 0 and Newton’s
second law then tells us that dp/dt = F = 0, leading
to the same conclusion.
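
The step from the canonical commutator to the conservation law can be written out as follows. This is a sketch under the assumption that H = P²/2m + U(X) and that U(X) can be expanded as a power series in X:

```latex
\begin{align*}
[P, H] &= \Big[P, \frac{P^2}{2m}\Big] + [P, U(X)] = [P, U(X)] , \\
[P, X^n] &= -i\hbar\, n X^{n-1}
  \quad\Rightarrow\quad [P, U(X)] = -i\hbar\,\frac{dU}{dX} , \\
i\hbar\,\frac{dP}{dt} &= [P, H] = -i\hbar\,\frac{dU}{dX}
  \quad\Rightarrow\quad \frac{dP}{dt} = -\frac{dU}{dX} .
\end{align*}
```

For a constant potential the right-hand side vanishes, which is the quantum counterpart of Newton's dp/dt = F = 0.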


We encountered this situation for example in the section
about the ‘free particle on a circle’ of Chapter II.5 where
we found that states were labeled by the quantized momentum
p = ħk (k integer), being a conserved quantum
number. So we chose a framework consisting of the energy
and the momentum operator, with as sampling space
just the momentum eigenvalues −∞ < p < +∞. Here we
see that if an underlying space-time symmetry, like trans- 
lation invariance, is also present in the Hamiltonian, then 
indeed, the spectrum reflects that. But there is always a 
dual aspect. On the one hand the momentum P is
the conserved quantity, but on the other hand that very same P
is the generator of the symmetry transformations, here the
translations. We have illustrated this general relationship
in Figure II.6.1. 


Rotations and angular momentum conservation. Let 
us now consider a more complicated example where sym- 
metry tells us a lot about the spectrum, the case of the 
Hydrogen atom. The spectrum exhibited a large degeneracy
which we explained and depicted already in Chapter I.4 in
Figure I.4.9. The states are labeled by three integer-valued
quantum numbers: the energy related quantum number
n = 1, 2, ..., the angular momentum quantum number
l = 0, 1, ..., n − 1 and the magnetic quantum number
−l ≤ m ≤ l. In this problem we have a spherically symmetric
electric force field centered at the nucleus in the origin.
The energy consists of two parts, a kinetic part p²/2m
and a potential part −k/|x|, and each part depends only on
the length of the vectors and therefore is invariant under 
rotations. So we expect that the generators of rotations 
commute with the Hamiltonian and that they are there- 
fore conserved, and somehow their sample spaces should 
be reflected in the labeling of the degenerate states with 
equal energy. Indeed, the generators of those rotations 
around the x, y, and z axes are the corresponding angu- 
lar momentum observables/operators defined as a vector 
L: 


$$ \mathbf{L} = \mathbf{X} \times \mathbf{P} . $$


Furthermore, the three components are conserved, as one
can indeed show:

$$ [H, L_i] = 0 , \qquad i = 1, 2, 3 . $$


But now a further complication pops up: the conserved 
components of L do not commute among each other. We 
have: 


$$ [L_1, L_2] = i\hbar L_3 , \quad \text{and cyclic permutations.} \tag{II.6.2} $$


This algebra of real three-dimensional rotations, denoted 
as so(3) happens to be identical to the by now familiar 
su(2) Lie algebra. To describe the system we need to 
choose a framework F, which means that we have to 
choose a subset of mutually commuting operators. Conventionally
one chooses the following set: H, L² = L_1² + L_2² + L_3²
and L_3, with the eigenvalues:


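$$
\begin{aligned}
H\,|\psi_{nlm}\rangle &= \frac{E_1}{n^2}\,|\psi_{nlm}\rangle ; \\
L^2\,|\psi_{nlm}\rangle &= \hbar^2\, l(l+1)\,|\psi_{nlm}\rangle ; \\
L_3\,|\psi_{nlm}\rangle &= \hbar\, m\,|\psi_{nlm}\rangle .
\end{aligned}
\tag{II.6.3}
$$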


And as we mentioned before, for a fixed value of the principal
quantum number n, there are in fact 2n² degenerate
states as a consequence of the symmetries that are
present in the problem. The set of those states forms a basis
for all allowed states with an energy corresponding to
that value of n. If we take n = 3, we should have l = 0,
l = 1 and l = 2, but the symmetry algebra so(3) given in
(II.6.2) does not change the value of l, only the values of
m from −l to +l, which means that the rotational symmetry
only accounts for the (2l + 1)-fold degeneracy for each
value of l. The conclusion therefore is that for n = 3,
the spectrum consists of the three distinct irreducible
representations of the rotation group (labeled by l = 0, 1, 2),
see also Figure II.6.2. That suggests that there may be
more symmetry present in this problem, a topic we will
return to shortly.
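
The counting in this paragraph is easy to reproduce. The sketch below lists the (l, m) labels for a given n and multiplies by two; attributing that factor of two to the two spin orientations of the electron is my reading of the 2n², and the function names are of course just illustrative.

```python
def hydrogen_states(n):
    """All (l, m) labels allowed for principal quantum number n."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]


def degeneracy(n, spin_states=2):
    """Number of orbital labels times the number of spin orientations."""
    return spin_states * len(hydrogen_states(n))


for n in (1, 2, 3):
    # Size of each rotation multiplet: the (2l + 1)-fold degeneracy per l.
    multiplets = {l: 2 * l + 1 for l in range(n)}
    print(n, multiplets, degeneracy(n))
```

For n = 3 the multiplets have sizes 1, 3 and 5. Rotations only move you around inside each of them, so the fact that all three sit at the same energy is exactly what still needs explaining.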


Let us make another observation here. In the choice of the 
framework we at once introduced the operator L², which is

[Figure: weight lattice and root diagram of su(2) ≃ so(3)]

Figure II.6.2: The representations of su(2) ≃ so(3). The group
SU(2) has three generators that form the algebra su(2). The
root diagram has the diagonal L_3, which forms the Cartan
subalgebra H, while the arrows represent the raising and lowering
operators E± ∼ L±. The weights of all (unitary) representations
are on the weight lattice. We furthermore depicted the weight
diagrams of various irreducible representations labeled by
successively l = 0, 1/2, 1, 3/2, ....


strictly speaking not part of the Lie algebra. It is a quadratic 
combination of generators that has the nice property that 
it commutes with all of the su(2) ≃ so(3) generators:
[L², L_i] = 0. Such invariant polynomials (also called Casimir
operators or Racah invariants) play an important role
in Lie algebra theory because you can use them to label or
identify the inequivalent representations. And indeed the
eigenvalue l(l + 1) (or for that matter l) labels and distinguishes
the infinitely many different (irreducible) representations
of the algebra by (2l + 1) × (2l + 1) matrices.
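
If you want to see such matrices explicitly, the following sketch builds them for a few values of l (half-integers included) and checks both the commutation relation (II.6.2) and the Casimir eigenvalue. It assumes ħ = 1 and the standard ladder-operator construction; the helper names are hypothetical and chosen for this example.

```python
import math


def zeros(d):
    return [[0j] * d for _ in range(d)]


def matmul(a, b):
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def combine(a, b, ca=1, cb=1):
    """Return ca*a + cb*b for square matrices of equal size."""
    d = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(d)] for i in range(d)]


def close(a, b, tol=1e-12):
    d = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(d) for j in range(d))


def angular_momentum_matrices(two_l):
    """(L1, L2, L3) in the (two_l + 1)-dimensional representation, hbar = 1."""
    d, l = two_l + 1, two_l / 2
    l3, raising = zeros(d), zeros(d)
    for k in range(d):
        m = l - k                      # basis ordered m = l, l-1, ..., -l
        l3[k][k] = complex(m)
        if k > 0:                      # L+ |m> = sqrt(l(l+1) - m(m+1)) |m+1>
            raising[k - 1][k] = complex(math.sqrt(l * (l + 1) - m * (m + 1)))
    lowering = [[raising[j][i].conjugate() for j in range(d)] for i in range(d)]
    l1 = combine(raising, lowering, 0.5, 0.5)       # (L+ + L-)/2
    l2 = combine(raising, lowering, -0.5j, 0.5j)    # (L+ - L-)/(2i)
    return l1, l2, l3


def check_representation(two_l):
    """True if [L1, L2] = i L3 and L^2 = l(l+1) times the identity."""
    l1, l2, l3 = angular_momentum_matrices(two_l)
    d, l = two_l + 1, two_l / 2
    comm = combine(matmul(l1, l2), matmul(l2, l1), 1, -1)
    casimir = combine(combine(matmul(l1, l1), matmul(l2, l2)), matmul(l3, l3))
    identity = [[complex(i == j) for j in range(d)] for i in range(d)]
    return (close(comm, combine(l3, l3, 1j, 0)) and
            close(casimir, combine(identity, identity, l * (l + 1), 0)))


for two_l in range(5):                 # l = 0, 1/2, 1, 3/2, 2
    print(two_l / 2, check_representation(two_l))
```

For two_l = 1 the construction reproduces one half times the Pauli matrices, which is the doublet representation we have been using for the qubit.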


Vectors and spinors. Let us return to the abstract alge- 
bra (II.6.2) of so(3). We have mentioned that this alge- 
bra is identical to the algebra su(2) generated by (a half 
times) the Pauli matrices X, Y, and Z . And this implies that 
the algebra not only has integer l representations, but also 


half-integral, so-called spinor, representations. And as you 
see these do not show up in the orbital angular momentum 
part, but in the part associated with the spin of a particle, 
which is a degree of freedom that is not present at the clas- 
sical level. Actually saying that there is no classical equiv- 
alent is of course not correct. We have shown that the 
classical system underlying the spin-half, quantum degree 
of freedom, is just the classical two-state system of a bit or 
Ising spin. Not much ‘rotational’ about it and that is what 
is implied by saying that it has no classical analogue. But 
if you ‘believe’ the mathematics, the half-integral represen- 
tations had to be there somewhere, and yes they showed 
up in the anomalous Zeeman-effect that brought Uhlen- 
beck and Goudsmit in 1925 to their bold conjecture of the 
‘intrinsic spin’ of the electron, and 5 years later became 
a compulsory ingredient of any particle obeying the Dirac 
equation. This we discussed already in Chapter II.1. 


So what we learned from these examples is that the Lie 
algebra so(3) which happens to be the same as su(2) 
has an infinity of inequivalent (unitary) representations labeled
by an integer or half-integer quantum number j =
0, 1/2, 1, ... and that each representation can be realized by
(2j + 1) × (2j + 1) hermitian matrices. There is a basic
distinction between the integer and half-integer eigenvalue
representations: physicists refer to the integer ones
as vector representations and to the half-integer ones as
spinor representations. In the hydrogen atom we saw all
the representations showing up. In the discussions we had
on the qubit we started off with a single spin one-half (doublet)
representation, but as we mentioned before, in the n-qubit
space we have a much bigger symmetry group acting,
corresponding to SU(2ⁿ), which contains the product
group of n individual SU(2) groups as a subgroup.


An additional dynamical symmetry. Let us return to the 
spectrum of hydrogen and note that there is still something 
we haven't explained. The degeneracy observed at energy 
level n equals 2n². It involves a degeneracy of different
l representations, which cannot be accounted for by the

Figure II.6.3: The Runge-Lenz vector, A = P × L − mk x̂, is
an additional conserved quantity in the problem with a central
−k/r potential, where the origin of the coordinate r is in one of
the focal points. A always points parallel to the long axis of the
ellipse in the direction of the ‘perihelion’.


rotational symmetry. It could be ‘accidental’, but that would
be hard to believe if you have stayed with me so far. You 
would probably bet that it must be the consequence of yet 
another symmetry that we still have to disclose and that 
would make the whole picture even more striking. 


Indeed, that symmetry is there, as there are in fact three 
more (independent) observables that commute with the 
Hamiltonian if it has a central 1/r potential. The presence 
of this symmetry is directly linked to the particular form of 
the interaction potential and is therefore called a dynami- 
cal symmetry. The generators form a vector just like the 
angular momentum and that vector is called the Runge-Lenz
vector after its (re)discoverers.³ This vector, usually
denoted by A, is defined as:


$$ \mathbf{A} = \mathbf{P} \times \mathbf{L} - m k\,\hat{\mathbf{x}} . \tag{II.6.4} $$


³It has an interesting history with many rediscoveries going back to
the early 18th century. Pauli was the first to use it to solve the hydrogen 
atom in an article from 1926. 


We have constructed A at various points of a classical 
Newtonian elliptic orbit in Figure II.6.3, and we see that
it is indeed a constant of the motion. Note that it takes
some use of the ‘like-rule’ to get the orientation right and
then you see that the vector is parallel to the long axis of
the ellipse and points in the direction of the ‘perihelion’.
It is surprising that such a conserved vector-like quantity
exists, but you expect it on the quantum level to be responsible
for the extra degeneracy with respect to the quantum
number l = 0, 1, ..., n − 1.
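
The construction in the figure can be repeated numerically. The sketch below evaluates A at several points of a classical Kepler ellipse and finds the same vector every time. The parametrisation r = p/(1 + e cos θ) with the force centre in the origin and the perihelion on the positive x-axis, as well as the numerical values of m, k, the eccentricity and the angular momentum, are assumptions chosen for the example.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def runge_lenz(theta, m=1.0, k=1.0, ecc=0.6, ang_mom=1.0):
    """Return (A, L) at angle theta on the ellipse r = p / (1 + ecc*cos(theta)).

    Assumed setup: potential -k/r, semi-latus rectum p = ang_mom**2 / (m*k),
    orbit in the x-y plane, perihelion at theta = 0.
    """
    p = ang_mom ** 2 / (m * k)
    r = p / (1 + ecc * math.cos(theta))
    x_hat = (math.cos(theta), math.sin(theta), 0.0)
    position = tuple(r * c for c in x_hat)
    # Momentum on a Kepler orbit, written in terms of the angle theta.
    scale = m * k / ang_mom
    momentum = (-scale * math.sin(theta), scale * (ecc + math.cos(theta)), 0.0)
    L = cross(position, momentum)
    p_cross_l = cross(momentum, L)
    A = tuple(p_cross_l[i] - m * k * x_hat[i] for i in range(3))
    return A, L


for theta in (0.0, 1.0, 2.5, 4.0):
    A, L = runge_lenz(theta)
    print(round(theta, 2), [round(c, 12) for c in A], round(L[2], 12))
```

With these values A comes out as (m k e, 0, 0) at every point: its length is fixed by the eccentricity and it points from the force centre towards the perihelion.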


That explains by the way that in the Newtonian theory the 
elliptic orbit is completely fixed in space, and moreover it 
also explains that this feature disappears if we add a correction
term coming from Einstein’s general theory of relativity.
That term concerns a small 1/r³ contribution that
breaks the symmetry, and therefore the ellipse is no longer
fixed in space and starts rotating in the plane of the orbit.
This is the well-known ‘perihelion precession’ that was observed
for the planet nearest to the sun, Mercury, already
in the nineteenth century, and could indeed be accounted
for by Einstein’s theory. It illustrates the notion of an
approximate symmetry: it is not an exact symmetry but
nevertheless teaches us about essential features of the
system.


The full symmetry of the hydrogen atom 


After all this struggling with vector products you may like 
to know what the total symmetry algebra of the hydrogen 
atom really is. This algebra is six-dimensional, and is in- 
deed generated by the three L and the three A compo- 
nents. They form a closed algebra and it is in fact the 
algebra so(4) of the rotations in four dimensions. So here 
we are, we set up a problem in three dimensions and now 
we get a spectrum exhibiting a manifest so(4) symmetry.
It underscores that the algebra has many representations
and these may show up in all kinds of contexts
which have nothing whatsoever to do with a physical
four-dimensional space. Here it surfaced because besides the
rather evident spatial rotational symmetry of the problem, 
there turned out to be the additional, somewhat hidden 
dynamical symmetry (dynamical because it depends on 
the particular 1/r behavior of the potential and not on the 
underlying space). Including that symmetry allowed us 
to fully resolve the degeneracies in the hydrogen spec- 
trum. 


Raising and lowering operators. We see that we have 
chosen a consistent framework F = {H, L², L_z} to label the
states. They are mutually commuting, but now you may
ask what happened to the other symmetry operators, L_x
and L_y for example, that commute with the Hamiltonian
but not with L_z. We basically know what their meaning
is as we showed before that they can be regrouped into 
raising and lowering operators that step up and down the 
different m values (within a single l representation). And 
similarly the components of the Runge—Lenz vector can 
be used to step up or down the value l of the total orbital 
angular momentum. So in this case these are operators 
that make steps not in energy but rather in other quantum 
numbers that label the degenerate states. 


So if we go to the table on page 447, we see that a framework
F typically involves a set of rank A operators forming
a so-called Cartan subalgebra H of A. A Cartan subalgebra
consists by definition of a maximal set of mutually
commuting generators of A. And indeed the other generators
in A − H can be regrouped in a complete set of raising
and lowering operators.


A full set of step and symmetry operators satisfying equa- 
tion (II.5.21) is called the spectrum generating algebra for 
the obvious reason that they allow you to walk through 
the sample space, in principle finding all the energy eigen- 
states and their quantum numbers referring to a framework 
compatible with the energy operator. 


Generating the spectrum (sample space). Let us assume
that by some means we succeeded in constructing
a complete set of step operators which bring you from
one energy level to another. One could then in principle imagine
looking for the ground state(s) (the state(s) that are
‘annihilated’ by all the lowering operators) and then using the
spectrum generating algebra of all step and symmetry
operators to generate the whole spectrum of eigenstates of
the Hamiltonian.


We have seen that symmetries, and in particular the max- 
imal set of mutually commuting symmetry operators, yield 
the set of quantum numbers that allows us to label and 
distinguish a relevant basis for all states. And as the labels 
of such base states correspond to eigenvalues of symmetry
operators they are conserved in time. Therefore, in
a general sense, such a maximal set allows us to ‘name’ 
the properties of the system, since ‘names’ are useful pre- 
cisely because they do not change all the time. On the 
other hand if the system undergoes interactions, the prop- 
erties may change and also then it is important to have a 
proper identification of property names or quantum num- 
bers. For example, the interaction may excite the system 
and therefore basically act like a raising operator. 


Symmetry algebra and symmetry group 


So far we have talked about the observables Q_i that commute
with the Hamiltonian. They are conserved and we
have seen that they generate a symmetry. That means that 
acting with them gives an infinitesimal displacement corre- 
sponding to a tiny symmetry transformation. This applies 
of course only to the case of continuous symmetries. You 
might wonder what a finite transformation then would look 
like and how they are described. It is here that we have 
to move from the mathematical concept of a (Lie)-algebra 
to that of a Lie group. This question is briefly addressed 
in the Math Excursion on Vectors and Matrices on page
635, and we have used it in Chapter II.3 in the section on
the Berry phase on page 347.

Figure II.6.4: The group manifold of SU(2). SU(2) can be
represented as a solid three-dimensional ball with radius 2π. A
point in that space corresponds with a rotation around the vector
by an amount that corresponds to its length. All points on the
surface are identified and represent minus the identity element:
a rotation about any axis by 2π yields an overall phase minus
one.


Exponentiation of the algebra. Let us return to the question
of frame rotations for a qubit corresponding to a
two-dimensional (complex) vector.


We considered the Z-frame and the X-frame and these 
frames are clearly related to each other by a finite rotation 
over an angle of 45° around the y-axis (perpendicular to
the z- and x-axes). Let us make a rotation over
an angle θ around the y-axis⁴ and use the matrix version
of the Euler identity:


$$ e^{i\theta Y/2} = \mathbb{1}\cos\frac{\theta}{2} + iY \sin\frac{\theta}{2}
 = \begin{pmatrix} \cos\theta/2 & \sin\theta/2 \\ -\sin\theta/2 & \cos\theta/2 \end{pmatrix}
 = R_y(\theta) . $$


⁴Here the factor a half comes back and becomes relevant. The
parameter is θ, but the generator satisfying the su(2) commutation
relations is Y/2, and therefore it looks like a rotation by θ/2, but it is not.


Let us apply this to see what it does with the basis vec- 
tors: 


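$$ \begin{pmatrix} \cos\theta/2 & \sin\theta/2 \\ -\sin\theta/2 & \cos\theta/2 \end{pmatrix}
 \begin{pmatrix} 1 \\ 0 \end{pmatrix}
 = \begin{pmatrix} \cos\theta/2 \\ -\sin\theta/2 \end{pmatrix} . $$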
If we put θ = 90°, we get exactly the finite rotation of state
|+1⟩ to |−⟩ as indicated in Figure II.2.1, where the frame
choices are discussed and how these choices are related
to the unitary group transformations we denoted as U in
our discussion in Chapter II.1. We also know how to apply
this transformation to the operators. We have to act from
both sides, for example:


$$ Z \to R_y(\theta)\, Z\, R_y(-\theta) = -X , $$


where we have used the fact that R_y(θ)† = R_y(−θ).
This explicitly resolves a puzzle that you may have felt uneasy
about. The algebra is three-dimensional with X/2, Y/2
and Z/2 as basis vectors, and indeed rotating Z around
the y-axis with θ = 90° yields −X, exactly as you would
expect, but applying the same transformation to the qubit
rotates the two-dimensional ‘vector’ only over 45 degrees.
How is that possible? Well, to be precise, the qubit is not a
vector in the usual sense; it is for that reason that we introduced
the term spinor, exactly to make this distinction.
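
You can check both halves of this puzzle with a few lines of arithmetic. The sketch below builds R_y(θ) = exp(iθY/2) from the Euler form, compares it with a truncated power series of the exponential, and then applies it once to the operator Z and once to the state (1, 0). The helper names and the number of series terms are choices made for this illustration.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rot_y(theta):
    """Euler form: exp(i*theta*Y/2) = 1*cos(theta/2) + i*Y*sin(theta/2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * I2[i][j] + 1j * s * Y[i][j] for j in range(2)] for i in range(2)]


def exp_series(theta, terms=40):
    """The same group element from the power series of the exponential."""
    gen = [[1j * theta / 2 * Y[i][j] for j in range(2)] for i in range(2)]
    total = [[complex(v) for v in row] for row in I2]
    term = [[complex(v) for v in row] for row in I2]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, gen)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total


def rotate_operator(op, theta):
    """Operators transform from both sides: R op R^dagger."""
    r = rot_y(theta)
    return matmul(matmul(r, op), dagger(r))


def rotate_state(state, theta):
    """States (spinors) transform from one side only."""
    r = rot_y(theta)
    return [r[i][0] * state[0] + r[i][1] * state[1] for i in range(2)]


theta = math.pi / 2
print(rotate_operator(Z, theta))     # the operator Z has turned into -X
print(rotate_state([1, 0], theta))   # components (cos 45 deg, -sin 45 deg)
```

The operator is rotated by the full 90 degrees, while the components of the state only move by 45 degrees, which is the spinor behavior described above.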


From the above considerations one may show that any finite
SU(2) group transformation can be parametrized as


$$ g(\theta) = e^{i\theta^a T_a} \quad\text{with}\quad \{T_a\} = \{X/2,\; Y/2,\; Z/2\} . $$


Finite translations. For the translations one can do a sim- 
ilar exponentiation, 


$$ T(a) = e^{iaP/\hbar} , \tag{II.6.6} $$

which gives that on an operator which depends on X and
P we obtain after a finite translation by any amount a:


$$ f(X, P) \to T(a)\, f(X, P)\, T(-a)
 = e^{iaP/\hbar}\, f(X, P)\, e^{-iaP/\hbar} = f(X + a, P) . \tag{II.6.7} $$
In particular one has the property that T(a) X T(−a) =
X + a, showing that the X operator has been shifted by
a.


In the same vein you can show that if [H, P] = 0, so that P is
conserved, then also


T(a)HT(—a) =H, 


which literally says that it leaves the Hamiltonian invari- 
ant, i.e. the translations are a symmetry of the Hamilton- 
ian. 
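
To make the finite translation concrete, take the position representation and assume T(a) = exp(iaP/ħ) with P = −iħ d/dx, so that T(a) = exp(a d/dx). Acting on a polynomial, this exponential series terminates, and the sketch below (with hypothetical helper names) shows that it simply shifts the argument by a.

```python
import math


def derivative(coeffs):
    """d/dx of a polynomial given as [c0, c1, c2, ...]."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    return sum(c * x ** n for n, c in enumerate(coeffs))


def translate(coeffs, a):
    """Apply exp(a d/dx) = sum_n (a^n / n!) d^n/dx^n to a polynomial."""
    result = [0.0] * len(coeffs)
    term = list(coeffs)
    n = 0
    while term:                        # the series stops once the derivative is empty
        weight = a ** n / math.factorial(n)
        for i, c in enumerate(term):
            result[i] += weight * c
        term = derivative(term)
        n += 1
    return result


f = [1.0, -2.0, 0.0, 4.0]              # f(x) = 1 - 2x + 4x^3
shifted = translate(f, 0.5)
for x in (-1.0, 0.0, 2.0):
    print(evaluate(shifted, x), evaluate(f, x + 0.5))
```

Each printed pair should agree up to rounding: exponentiating the generator of infinitesimal displacements gives the finite displacement.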


What I am trying to make plausible is that by ‘exponentiating
the algebra’ we do get the corresponding group.
Whereas the algebra describes infinitesimal transforma- 
tions you need the group to do finite transformations. And 
whereas the algebra is a linear vector space, the group is 
some smooth curved manifold. 
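
One way to picture exponentiation is as taking very many very small steps. The sketch below composes N copies of the infinitesimal rotation 1 + i(θ/N)Y/2 and compares the product with the finite rotation exp(iθY/2). The matrices are written in real form using iY = [[0, 1], [−1, 0]], and the function names are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Product of two 2x2 real matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exact_rotation(theta):
    """Finite group element exp(i*theta*Y/2) as a real 2x2 matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, s], [-s, c]]


def composed_rotation(theta, steps):
    """Apply the infinitesimal step 1 + i*(theta/steps)*Y/2 'steps' times."""
    eps = theta / steps
    step = [[1.0, eps / 2], [-eps / 2, 1.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        result = matmul(step, result)
    return result


def max_error(theta, steps):
    exact = exact_rotation(theta)
    approx = composed_rotation(theta, steps)
    return max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))


for steps in (1, 10, 100, 10000):
    print(steps, max_error(math.pi / 2, steps))
```

A single linear step over 90 degrees is a poor approximation, but the mismatch shrinks roughly in proportion to 1/N as the steps get smaller.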


The group space or manifold of SU(2). You can think 
of a group as a smooth manifold or space. For example, 
the group U(1) is just a circle as we mentioned before. 
For the real space translations it is ℝ³ because a finite
translation in space is fixed by the three components of 
the displacement vector. 


The group SU(2) is isomorphic to the three-sphere S³ as
we discussed in Chapter II.1 on page 254. So exponentiating
the su(2) algebra (note the use of lowercase) we get
the SU(2) group (in capitals). The su(2) algebra has generators
X, Y, and Z, and is therefore three-dimensional.
The dimensionality of the algebra is the same as that of 
the group (manifold). The group SU(2) has therefore three 
independent parameters, or coordinates. You can think 


[Figure: the group manifold SU(2) ≃ S³ with unit element e, embedded in ℝ⁴]


Figure II.6.5: The group SU(2) and its algebra su(2). SU(2)
can also be represented as a unit three-sphere S³ embedded in
ℝ⁴. The su(2) algebra can then be thought of as the ℝ³ tangent
space to the group manifold in the origin (the point corresponding
to the trivial or unit element e).


of the algebra as the tangent (hyper) plane to the group 
manifold in the unit element e (corresponding to the trivial 
transformation). That plane has of course the same dimension
but is a linear, flat space like ℝⁿ. In Figure II.6.5
we have illustrated this relation between the SU(2) group
and the su(2) algebra. If you stay near the unit element, a 
change in the tangent plane is almost as good as moving 
on the group manifold. It’s like assuming that the Earth is 
flat, which is not such a bad approximation if you look on 
the scale of kilometers, but causes serious trouble if you 
start thinking in terms of thousands of kilometers! Think- 
ing locally amounts to making a linear approximation, as 
for small a ≈ ε we may write


$$ T(\varepsilon) \approx 1 + i\varepsilon P/\hbar . $$


In this terminology the algebra generates infinitesimal
transformations. In short: thinking locally and acting globally is
bad, while thinking globally and acting locally is fine.
Gauge symmetries


We have argued that the equations that form the starting 
point for quantum fields are basically the same equations 
that one can write down for classical fields. Those classical 
fields change from being just functions on the configuration 
space to operator valued fields. And these then have to 
be quantized typically using canonical methods where the 
fields become like ‘field coordinates’ and their derivatives 
like ‘field momenta’. 


Electrodynamics revisited. Let us go back to the Schrödinger
or better the Dirac equation in three plus one dimensions
and ask how we could implement the interactions
with the electromagnetic field. Somewhere in the equa- 
tions there ought to appear terms that describe this inter- 
action. Now we go through a beautiful argument where 
you will see how a number of rather peripheral remarks 
we have been making before all fall into place and yield a 
profound insight. That insight amounts to the fact that na- 
ture has a hidden symmetry and that imposing that sym- 
metry completely fixes the precise form of the interactions 
(fundamental forces) between the elementary constituent 
particles. 


I give the argument in relativistic notation, because that
keeps things simple and elegant. The argument also holds 
true in non-relativistic situations. We want to use space-time
vectors that have four components: for example, instead
of using the usual momentum vector p we switch
to the four-momentum written p_μ, where μ = 0, ..., 3 and
the time component of the four-momentum is defined as
p_0 = E/c. Now if you look at the equations describing the
interaction of charged particles with the electromagnetic
field, then it turns out that you can get those interactions
exactly right if you use a simple trick that goes by the name
of ‘minimal substitution’. It is a recipe that says: for a particle
with a charge e replace everywhere the momentum p_μ
by p_μ + eA_μ. The four-vector A_μ = (V, A) contains the
electromagnetic potentials, where V is the electrostatic or scalar
potential and A is the vector potential.


These were introduced in the section on electrodynamics
in Chapter I.1, together with the electromagnetic field
strength F_μν:


$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{II.6.8} $$


The three spatial components F_ij correspond with the components
of B, and the space-time components F_0i correspond
with the components of E.


Gauge invariance. In Chapter I.1 we argued that there 
is some redundancy in keeping all the six components of 
the fields E and B and one could do with only the four 
components of the gauge potential A_μ. That is indeed the
case but as a matter of fact even that doesn’t eliminate all 
redundancy. In the formulation with the gauge potentials 
there is still some redundancy left, because we can make a 
transformation on the gauge potentials that leaves the field
strength F_μν and thus the physical E and B fields invariant.
This transformation is called a gauge transformation and
involves a space-time dependent function Λ(x, t):


$$ A_\mu(x, t) \to A'_\mu(x, t) = A_\mu + \partial_\mu \Lambda(x, t) . \tag{II.6.9} $$
If you substitute the transformed field into (II.6.8), you
immediately see that the extra terms cancel each other out,
and that proves the invariance (and the efficiency of the 
relativistic notation). 
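
Written out, the cancellation is a one-line computation. The only assumption is that Λ is smooth enough for its partial derivatives to commute:

```latex
\begin{align*}
F'_{\mu\nu} &= \partial_\mu A'_\nu - \partial_\nu A'_\mu \\
 &= \partial_\mu (A_\nu + \partial_\nu \Lambda) - \partial_\nu (A_\mu + \partial_\mu \Lambda) \\
 &= F_{\mu\nu} + (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu)\,\Lambda
  = F_{\mu\nu} .
\end{align*}
```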


This invariance is of another type than we have been dis- 
cussing before, because the transformation depends on 
space-time. It is called a local transformation because by
choosing the transformation you fix the amount by which 
you transform in every point independently, as long as it 
changes smoothly from one space-time point to the next. 
This means that we are effectively dealing with only three 
components for the gauge potential, because one may 
choose the gauge function in such a way as to ‘gauge 
away’ one of the components of the gauge potential. So
why don’t we get rid of it, you may say, and strip the description
of the electromagnetic field to the bare minimum? This
is not so easy and you could say that keeping the redun- 
dancy is the price we pay for the transparency and com- 
pactness of the theory, and most importantly its linearity. 
This theory is beautiful like a peacock, with the exceptional 
property that it can fly as well! We are a bit like dealers in 
options when we talk about the field strengths which cor- 
respond to the invariant physical degrees of freedom, but 
these are in fact derivatives of the underlying potentials to 
which the particles couple. 


Covariant derivative. The minimal substitution means
that for charged particles we change the momentum
operator to


$$ P_\mu = -i\hbar\,\partial_\mu \;\to\; -i\hbar\, D_\mu
 = -i\hbar\left(\partial_\mu + i\frac{e}{\hbar} A_\mu\right) , \tag{II.6.10} $$


where e is of course the charge of the particle. In other
words, the recipe is to replace the ordinary derivative ∂_μ
by the covariant derivative D_μ.


We also remarked before that the Schrödinger or Dirac 
field is complex and therefore has a real and an imagi- 
nary part. And we furthermore made the point that there is 
always one overall phase that is unobservable and therefore has no
physical meaning. Transforming that phase into
another phase would not matter; it reshuffles the real and 
the imaginary parts of the wave function but the combina- 
tion of the two has exactly the same content. Neverthe- 
less, there is a phase symmetry because there is a phase 
transformation that leaves the physics invariant:


$$ \psi(x_\mu) \to \psi'(x_\mu) = e^{i\alpha}\,\psi(x_\mu) . \tag{II.6.11} $$


Furthermore, the equations with the interaction term also
are invariant under this phase transformation. This
transformation is often called a global, meaning space-time
independent, gauge transformation.


Now we pose the interesting question whether these equations
are also invariant under local, which means space-time
dependent, phase transformations:


$$ \psi(x_\mu) \to \psi'(x_\mu) = e^{i\alpha(x_\nu)}\,\psi(x_\mu) . $$


On first inspection the answer is no, because the equa- 
tions have derivatives that ‘see’ that space-time dependent 
phase factor and are going to make trouble about it be- 
cause: 


$$ \partial_\mu\psi \to \partial_\mu\psi' = e^{i\alpha(x_\nu)}\left(\partial_\mu + i\,\partial_\mu\alpha(x_\nu)\right)\psi ; $$


and the transformed equation would be different because 
of this extra term involving the derivative of the space-time 
dependent phase. But wait a minute, what if we include 
the gauge potentials as we are supposed to do if we adopt 
the minimal substitution doctrine. Then we get: 


$$ D_\mu\psi \to (D_\mu\psi)' = e^{i\alpha(x_\nu)}\left(\partial_\mu + i\,\partial_\mu\alpha(x_\nu) + i\frac{e}{\hbar} A'_\mu\right)\psi . $$


Now please observe a tiny miracle: if we just substitute
the expression (II.6.9) for the gauge transformed A'_μ and make
the judicious choice Λ = −(ħ/e)α, then the net effect of
the two transformations is zero and we get that the gauge
covariant derivative transforms exactly as we want,


$$ D_\mu\psi \to (D_\mu\psi)' = e^{i\alpha(x_\nu)}\, D_\mu\psi . $$


It transforms ‘covariantly’ just like the field ψ itself and
therefore the complete theory involving also matter fields 
becomes gauge invariant. This result implies that the equa- 
tions transform now simply by an overall local phase, which 
we can divide out and we have not changed anything. 


We conclude that the complete system of Maxwell equations
coupled to the Schrödinger or Dirac equations
exhibits this local gauge invariance.
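
The ‘tiny miracle’ can also be checked numerically. The sketch below works in one dimension and assumes the sign convention D = d/dx + i(e/ħ)A that follows from the substitution p → p + eA, for which the compensating gauge function is Λ = −(ħ/e)α (with the opposite sign in front of A the sign of Λ flips as well). The test functions ψ, α and A, the step size and the units e/ħ = 1 are arbitrary choices for the example.

```python
import cmath
import math

E_OVER_HBAR = 1.0   # assumption: units in which e/hbar = 1
STEP = 1e-5         # step size for the central finite difference


def ddx(f, x):
    return (f(x + STEP) - f(x - STEP)) / (2 * STEP)


def psi(x):                    # some smooth complex wavefunction
    return cmath.exp(-x * x) * (1 + 0.5j * x)


def alpha(x):                  # space dependent phase
    return math.sin(x)


def A(x):                      # some gauge potential
    return math.cos(2 * x)


def gauge_function(x):         # Lambda = -(hbar/e) * alpha
    return -alpha(x) / E_OVER_HBAR


def A_prime(x):                # A' = A + dLambda/dx, as in (II.6.9)
    return A(x) + ddx(gauge_function, x)


def psi_prime(x):              # psi' = exp(i*alpha) * psi
    return cmath.exp(1j * alpha(x)) * psi(x)


def covariant_derivative(field, potential, x):
    return ddx(field, x) + 1j * E_OVER_HBAR * potential(x) * field(x)


def covariant_mismatch(x):
    """|D'psi' - exp(i*alpha) D psi|, which should vanish."""
    lhs = covariant_derivative(psi_prime, A_prime, x)
    rhs = cmath.exp(1j * alpha(x)) * covariant_derivative(psi, A, x)
    return abs(lhs - rhs)


def ordinary_mismatch(x):
    """The same comparison for the ordinary derivative, which does not vanish."""
    return abs(ddx(psi_prime, x) - cmath.exp(1j * alpha(x)) * ddx(psi, x))


for x in (-1.0, 0.3, 0.8):
    print(x, covariant_mismatch(x), ordinary_mismatch(x))
```

The covariant mismatch is zero up to the finite-difference error, while the ordinary derivative picks up the extra term proportional to the derivative of α.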


Gauge connection and parallel transport. The gauge
invariant parts of the electromagnetic field are the E and B
fields, or the components of F_μν. But as we have been discussing
already in the previous section on particle statistics
and anyons there is a more subtle non-local quantity
that is gauge invariant, namely the Aharonov-Bohm phase
factor or Wilson loop defined in equation (II.3.3).


If there is curvature (field strength) then the transport
between point x_0 and x_1 becomes path dependent. The
linear covariant equation:


$$ D_\mu \psi(x) = 0 ; $$


has a general path dependent solution:


$$ \psi(x_1) = \exp\!\left(-\frac{ie}{\hbar}\int_{x_0}^{x_1} A_\mu\, dx^\mu\right)\psi(x_0) . $$
It looks quite daunting, but think of it as just a phase factor,
where the phase equals this integral of A_μ along the
path, which is after all just a real number. This expression
tells you precisely what parallel transport means: it tells
you how the electromagnetic phase changes if you move
in position space. And the covariant derivative in (II.6.10)
is the infinitesimal version of that. The first term with the
derivative generates a translation, while the second generates
the phase transformation. This also connects with the
entries in the table on page 447: the exponential is a phase
factor corresponding to a group element of the group U(1),
which is just a circle. And A_μ is the connection one-form
which takes a value in the Lie algebra, which is just the
phase itself. U(1) is a one-dimensional group, and it is generated
by a ‘one by one hermitian matrix’: in other words a
real number.


The other point is that this ties in perfectly with our ear- 
lier observations in the previous section concerning the 
Aharonov-Bohm phase factor, as a means of measuring 
the magnetic flux up to multiples of the basic flux quantum 
2πħ/q. The remarkable aspect is that the path may entirely
lie in a region where the electric and magnetic fields
themselves are zero, yet the closed loop measures a non- 
trivial and gauge invariant quantity. It measures a topolog- 
ical aspect of the theory. 
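
To make this less abstract, here is a small numerical sketch of the Aharonov-Bohm phase. It is only an illustration under stated assumptions: an idealized, infinitely thin solenoid along the z-axis, units in which ħ = q = 1, and the sign convention of the phase factor written above. The names `vector_potential`, `loop_integral` and `phase_factor` are made up for this example.

```python
import cmath
import math

HBAR = 1.0  # assumption: natural units
Q = 1.0     # assumption: unit charge


def vector_potential(x, y, flux):
    """A = flux / (2 pi r^2) * (-y, x): its curl vanishes everywhere except at r = 0."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))


def loop_integral(corners, flux, steps=2000):
    """Midpoint-rule approximation of the closed line integral of A . dx."""
    total = 0.0
    for k in range(len(corners)):
        x0, y0 = corners[k]
        x1, y1 = corners[(k + 1) % len(corners)]
        for s in range(steps):
            t = (s + 0.5) / steps
            ax, ay = vector_potential(x0 + t * (x1 - x0), y0 + t * (y1 - y0), flux)
            total += (ax * (x1 - x0) + ay * (y1 - y0)) / steps
    return total


def phase_factor(corners, flux):
    """Phase picked up by a charge Q transported once around the closed path."""
    return cmath.exp(-1j * Q * loop_integral(corners, flux) / HBAR)


flux_quantum = 2 * math.pi * HBAR / Q
around_solenoid = [(1, 1), (-1, 1), (-1, -1), (1, -1)]    # encloses the origin
away_from_solenoid = [(4, 1), (2, 1), (2, -1), (4, -1)]   # does not enclose it

print(phase_factor(around_solenoid, 0.25 * flux_quantum))
print(phase_factor(away_from_solenoid, 0.25 * flux_quantum))
print(phase_factor(around_solenoid, flux_quantum))
```

Both paths run entirely through a region where the magnetic field is zero, yet only the path that encircles the solenoid picks up a nontrivial phase, and a full flux quantum is invisible again.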


We finally recall the other application of the parallel trans- 
port notion as a way to measure some Hamiltonian land- 
scape by means of the so-called Berry phase, as we dis- 
cussed in Chapter II.3. There, the notion of parallel trans- 
port was used to detect ‘curvature’ or ‘field strength’ differ- 
ences between a flat and curved surface. 


Charge conservation. We have emphasized time and again that one
of the reasons why symmetry is important is that it corresponds
to conservation laws. In fact there is a basic theorem by the
German mathematician Emmy Noether stating that to any
one-parameter continuous symmetry there is an associated
conserved ‘charge’. Local symmetries include the corresponding
global symmetry, and one therefore expects that the gauge
symmetries will also correspond to conserved quantities. For
the electromagnetic gauge symmetry that is, not surprisingly,
the local conservation of electric charge.


A rather direct proof of this was already presented in the
subsection on gauge invariance on page 33 of Chapter I.1.
Recall that the interaction of the field with an external
current gives a contribution $A_\mu j^\mu$ to the Lagrangian
density. So if we make the gauge transformation we get only one
extra term in the Lagrangian density, equal to
$+ie(\partial^\mu\Lambda/\hbar)\,j_\mu$, because the current
itself is assumed to be gauge invariant. Invariance of the
theory requires this extra term to vanish after integration
over space-time. This in turn requires that the current
satisfies $\partial^\mu j_\mu = 0$, which amounts to the local
conservation of charge. This equation tells you that the change
of the charge in any given volume exactly equals the current
flowing through the surface bounding that volume. This is the
relativistic form of what we in general call a continuity
equation, which is indeed a local conservation law.
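
The step from the extra term to the continuity equation can be spelled out in a few lines. This is a sketch that assumes the gauge parameter $\Lambda$ vanishes at infinity, so that the boundary term of the partial integration drops out, and that writes the current as $j^\mu = (\rho, \mathbf{J})$ in units with $c = 1$; the constant prefactor of the extra term plays no role.

```latex
\int d^4x\, (\partial^\mu \Lambda)\, j_\mu
  = -\int d^4x\, \Lambda\, \partial^\mu j_\mu = 0
  \quad \text{for all } \Lambda(x)
  \;\;\Longrightarrow\;\; \partial^\mu j_\mu = 0 ,
\\[6pt]
\partial^\mu j_\mu = \frac{\partial \rho}{\partial t} + \nabla\cdot\mathbf{J} = 0
  \;\;\Longrightarrow\;\;
  \frac{dQ}{dt} = \frac{d}{dt}\int_V \rho\, d^3x
  = -\oint_{\partial V} \mathbf{J}\cdot d\mathbf{S} .
```

The last equality is Gauss's theorem: the charge inside a volume V can only change by flowing out through its boundary.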


Turning arguments around. A question that you might have raised
is whether we could have turned the arguments around and have
said: let us impose this invariance under local transformations
on the Dirac or Schrödinger equations, what do we have to do?
The answer would have been: you have to introduce a gauge
potential $A_\mu$ that transforms in such a way that it absorbs
the troublesome extra term coming from the derivative. So
introducing gauge fields is a necessary consequence of imposing
local gauge invariance.


It was through arguments along these lines that in 1954 the 
physicists Chen Ning Yang and Robert L. Mills discovered 
the structure of non-abelian gauge theories that form the 
backbone of the acclaimed Standard Model. 


Non-abelian gauge theories II 

In this section we go through the steps that brought Yang 
and Mills to what must have been an incredible eureka mo- 
ment: the discovery of non-abelian gauge theories. 


Think of our familiar qubit as a column vector with two complex
entries, but now we make it into a complex two-component spinor
or doublet field, which we denote by $\psi(x_\mu)$, and we have
the derivative $\partial_\mu$ which can act on it. Next we want
to make a field theory for $\psi$ that is locally gauge
invariant. The first thing is to ask what invariance there is
under constant or global transformations. Well, it is not just
a single phase: it can be any unitary frame rotation U, as we
discussed for example in the Math Excursion on page 635 of Part
III. Such rotations correspond to elements of the group SU(2),
and we learned that any element of the group can be written as
the exponential of an element of the su(2) algebra, which is a
linear combination of the Pauli matrices:

$$U = e^{iC}, \qquad C = \boldsymbol{\gamma}\cdot\mathbf{T}.$$


By construction C is hermitian ($C^\dagger = C$) and U therefore
unitary ($U^\dagger = U^{-1}$). Now we want to repeat the
exercise we did for the phase factor with this matrix-valued
‘phase’.
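
A quick numerical check may help to see that the exponential of a hermitian combination of Pauli matrices is indeed a unitary matrix with unit determinant. This is a minimal sketch, assuming the common normalization in which the generators are the Pauli matrices divided by two, and using the closed-form expression for the exponential of a 2x2 matrix. The helper names are invented for this example.

```python
import math

# Pauli matrices as nested tuples. Assumption: the generators are T = sigma / 2.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def su2_element(gamma):
    """Return U = exp(i gamma.sigma/2) = cos(t/2) 1 + i sin(t/2) n.sigma, with t = |gamma|."""
    t = math.sqrt(sum(g * g for g in gamma))
    if t == 0.0:
        return ((1 + 0j, 0j), (0j, 1 + 0j))
    n = [g / t for g in gamma]
    c, s = math.cos(t / 2), math.sin(t / 2)
    return tuple(
        tuple(
            c * (row == col) + 1j * s * sum(n[a] * PAULI[a][row][col] for a in range(3))
            for col in range(2)
        )
        for row in range(2)
    )


def dagger(m):
    """Hermitian conjugate (transpose and complex conjugate) of a 2x2 matrix."""
    return tuple(
        tuple(complex(m[col][row]).conjugate() for col in range(2)) for row in range(2)
    )


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[row][k] * b[k][col] for k in range(2)) for col in range(2))
        for row in range(2)
    )


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


U = su2_element((0.3, -1.2, 2.0))   # three arbitrary real parameters
print(matmul(U, dagger(U)))         # close to the unit matrix
print(det(U))                       # close to 1
```

Three real parameters go in and a special unitary 2x2 matrix comes out, which is the matrix-valued ‘phase’ the text refers to.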


Gauge covariant derivative. First we observe that the
derivative still has no problem with a constant complex
rotation, by which we mean that the three components of
$\gamma$ are constant. But what if the parameters become
space-time dependent, so that $\gamma = \gamma(x_\nu)$? Let us
look at what happens to the two-component covariant derivative
if we transform:


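$$\psi(x_\mu) \rightarrow U(x_\mu)\,\psi(x_\mu)$$

$$
\begin{aligned}
(D_\mu \psi) &= (\mathbb{1}\,\partial_\mu + iqA_\mu)\,\psi \;\rightarrow \\
(D_\mu \psi)' &= (\mathbb{1}\,\partial_\mu + iqA'_\mu)\,U\psi \\
&= U\,\bigl(\mathbb{1}\,\partial_\mu + U^{-1}(\partial_\mu U) + iq\,U^{-1}A'_\mu U\bigr)\,\psi \\
&= U\,(D_\mu \psi). \qquad \text{(II.6.12)}
\end{aligned}
$$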


In the first line we should now think of the covariant
derivative as a matrix, where the derivative is multiplied by
the unit matrix and A is some matrix with a structure we are
about to determine. The strength of the coupling between the A
and $\psi$ fields is given by the charge q. In the intermediate
line we have inserted the trivial factor $UU^{-1} = 1$ in
front, in order to obtain the expression in the desired form,
which appears in the bottom line. But that expression only
holds if the gauge field A has the interesting structure which
is more or less dictated by the derivative term:


$$U^{-1}\partial_\mu U = U^{-1}\,(i\,\partial_\mu\boldsymbol{\gamma}\cdot\mathbf{T})\,U.$$


Because the factors $U$, $U^{-1}$ and T are matrices, they do
not commute and one cannot just change the order in which they
appear in an expression.


Lie algebra valued gauge fields. Apparently this derivative
brings down the Lie algebra element and takes the derivative of
that, and the result gets rotated by the U factors around it.
The upshot is that this non-abelian gauge field has to be an
element of that same Lie algebra, so:

$$A_\mu = \mathbf{A}_\mu \cdot \mathbf{T}$$

and it has to transform as:

$$A_\mu \rightarrow A'_\mu = U A_\mu U^{-1} + \frac{i}{q}\,(\partial_\mu U)\,U^{-1}.$$

For the case at hand the conclusion is now clear: the gauge
field itself has to be an element of the Lie algebra, in this
case su(2), and has to transform like a connection. This
algebra is three-dimensional, as it has three independent
generators, and consequently three independent gauge fields are
needed, which represent three different gauge particles.
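
The covariance property can also be checked numerically. The sketch below is an illustration under stated assumptions, not a derivation: it works along a single coordinate x, takes the generators to be the Pauli matrices divided by two, uses the covariant derivative $\partial + iqA$ and the transformation rule $A' = UAU^{-1} + (i/q)(\partial U)U^{-1}$, and approximates derivatives by central differences. The functions `U`, `A` and `psi` are arbitrary smooth test functions chosen for this example.

```python
import math

PAULI = (((0, 1), (1, 0)), ((0, -1j), (1j, 0)), ((1, 0), (0, -1)))
Q = 1.3    # coupling strength q (arbitrary illustrative value)
H = 1e-5   # step size for central finite differences


def algebra(vec):
    """Lie-algebra element vec . T with T = sigma / 2 (assumed normalization)."""
    return [[sum(vec[a] * PAULI[a][r][k] for a in range(3)) / 2 for k in range(2)]
            for r in range(2)]


def su2(vec):
    """Group element exp(i vec . T), closed form for 2x2 matrices (needs vec != 0)."""
    t = math.sqrt(sum(v * v for v in vec))
    c, s = math.cos(t / 2), math.sin(t / 2) / t
    g = algebra(vec)
    return [[c * (r == k) + 2j * s * g[r][k] for k in range(2)] for r in range(2)]


def dagger(m):
    return [[complex(m[k][r]).conjugate() for k in range(2)] for r in range(2)]


def matmul(a, b):
    return [[sum(a[r][j] * b[j][k] for j in range(2)) for k in range(2)] for r in range(2)]


def matvec(m, v):
    return [sum(m[r][k] * v[k] for k in range(2)) for r in range(2)]


def d_matrix(f, x):
    """Central-difference derivative of a matrix-valued function."""
    hi, lo = f(x + H), f(x - H)
    return [[(hi[r][k] - lo[r][k]) / (2 * H) for k in range(2)] for r in range(2)]


def d_vector(f, x):
    """Central-difference derivative of a vector-valued function."""
    hi, lo = f(x + H), f(x - H)
    return [(hi[r] - lo[r]) / (2 * H) for r in range(2)]


def U(x):
    return su2((math.sin(x), 0.5 * x, math.cos(2 * x)))


def A(x):
    return algebra((x, 1.0 - x * x, math.sin(3 * x)))


def psi(x):
    return [complex(math.cos(x), x), complex(x * x, -1.0)]


def psi_prime(x):
    return matvec(U(x), psi(x))


def A_prime(x):
    """Full rule: A' = U A U^-1 + (i/q) (dU) U^-1."""
    u_inv = dagger(U(x))
    rotated = matmul(matmul(U(x), A(x)), u_inv)
    extra = matmul(d_matrix(U, x), u_inv)
    return [[rotated[r][k] + (1j / Q) * extra[r][k] for k in range(2)] for r in range(2)]


def A_naive(x):
    """Homogeneous rotation only, without the derivative term."""
    return matmul(matmul(U(x), A(x)), dagger(U(x)))


def covariant_derivative(gauge, field, x):
    """D psi = d psi + i q A psi along the single coordinate x."""
    return [d + 1j * Q * a
            for d, a in zip(d_vector(field, x), matvec(gauge(x), field(x)))]


def mismatch(gauge, x):
    """Largest component of D'(U psi) - U (D psi)."""
    lhs = covariant_derivative(gauge, psi_prime, x)
    rhs = matvec(U(x), covariant_derivative(A, psi, x))
    return max(abs(l - r) for l, r in zip(lhs, rhs))


print(mismatch(A_prime, 0.7))   # tiny: only finite-difference error remains
print(mismatch(A_naive, 0.7))   # of order one: the derivative term is essential
```

With the full transformation rule the two sides agree up to discretization error, while dropping the derivative term spoils the covariance. That extra term is exactly what makes the gauge field a connection.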


Principal fiber bundles. The appropriate mathematical setting
of gauge theories is that of fiber bundles, as we discussed
already in the section on the ‘Physics of geometry’ on page 78
of Chapter I.2. These bundles are defined by a triple
$\{E, M, \pi\}$: a bundle space E, a base manifold M (which
would be our space-time manifold) and a projection $\pi$ from E
onto M, together with a gauge (or structure) group G. The
dimension of E equals the sum of the dimensions of M and G. The
space E looks locally like a direct product $M \times G$, but
can be different globally, in which case we speak of a
nontrivial bundle. The inverse of the projection $\pi$ at a
point $x_\mu \in M$ gives you the fiber above that point, which
is a copy of (isomorphic to) G. Choosing a smooth section,
meaning that you choose a particular group element out of each
fiber, produces an explicit form of a gauge covariant
derivative on M. Gauge transformations are related to the
changing of sections of the bundle.


This setting allows you to naturally define topologically
nontrivial gauge field configurations that can be characterized
by topological invariants like the Chern classes. Deep results
relevant for physics were obtained, for example a variety of
so-called index theorems, like the Atiyah-Singer index theorem,
which links the topological invariant of the gauge field
configuration to the net number of left- versus right-handed
solutions of the zero-mass Dirac equation coupled to that
(background) field. Interestingly, the Yang-Mills equations
were not considered before they appeared in the physics
literature, and only afterwards became a major mathematical
topic in the 1970s.


Once more the Standard Model. We mentioned that the number of
gauge particles is equal to the dimension of the Lie algebra,
which is just the number of independent parameters or
generators. The argument does not depend on the particulars and
basically holds for any gauge group, including the groups U(1),
SU(2) and SU(3) that appear in the Standard Model. The weak and
electromagnetic interactions have the gauge group
SU(2) × U(1), where the charged $W^\pm$ bosons correspond to
the raising and lowering operators $T_\pm$, while the photon
and the neutral Z boson are linear combinations of the neutral
$W^0$ boson and the Y boson associated with the U(1) factor of
the gauge group. The three W bosons thus correspond to the
three-dimensional (iso)spin-1 representation in Figure II.6.2,
while the quark and lepton fields form doublets corresponding
to the (iso)spin-1/2 representation.
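
The counting of gauge particles is simple enough to put in a few lines. The sketch below only uses the standard fact that SU(n) has n² - 1 independent generators; the function name `su_dimension` and the dictionary labels are just for this example.

```python
def su_dimension(n):
    """Number of independent generators of SU(n), which is n*n - 1."""
    return n * n - 1


gauge_bosons = {
    "U(1)": 1,                   # the Y boson
    "SU(2)": su_dimension(2),    # W+, W- and W0
    "SU(3)": su_dimension(3),    # the eight gluons
}

for group, count in gauge_bosons.items():
    print(group, count)
print("total", sum(gauge_bosons.values()))
```

After the mixing of the W0 and Y bosons into the photon and the Z boson the total number of gauge particles is of course unchanged.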


Colors and Flavors. Quantum Chromodynamics (QCD), the theory
for the strong interactions, has gauge group SU(3), which has
dimension eight. The eight gluons correspond to the weights of
the root diagram (including two zero weights in the center) as
shown in Figure II.6.6. In this figure we have also marked the
color (anti-)triplet representations corresponding to the
weights of the (anti-)quark fields.⁵


At this point you may experience a déjà vu moment, because
Figure I.4.33 in Chapter I.4 flashed back in your mind, which
indeed looks very similar to Figure II.6.6. Yes, true, but it
actually refers to a very different context. There we were
talking about the flavor symmetry, the classification scheme
discovered by Gell-Mann and Zweig. It is indeed also an SU(3)
symmetry, and it also applies to the quarks, but on the other
hand it is a very different type of SU(3) symmetry. Firstly, it
is not a gauged symmetry, but instead an approximate global or
rigid symmetry, so there are no gauge particles associated with
it. And as the quarks of different flavors have different
masses it is indeed only an approximate symmetry, because the
particle states are not really degenerate. Our knowledge at
this point suggests that this symmetry is accidental, and once
you accept that it is only approximate you may as well declare
that there is an SU(4) or even SU(6) flavor symmetry. This
would be the case if you in addition take the charm, top and
bottom quark flavors along. Anyway, the physics related to
these two SU(3) groups is entirely different: the flavor
symmetry is manifest in the spectrum of observed particles, as
the figure in Chapter I.4 shows. The mesons for example belong
to an octet and these are free particles. The color property of
particles is hidden because of the confinement phenomenon,
which only allows color-neutral or singlet states to be free
particles. This made it so hard to uncover the color symmetry
in the first place.

⁵ The gluon circles carry a quark and anti-quark color, and we
have given the anti-quarks the anti- or better complementary
color in the figure. In Figure I.4.36 the gluons are also
bicolored, but there both the quark and antiquark have the same
color but have arrows in the opposite direction.


Color singlets. The singlet property has to do with
constructing colorless combinations of quarks (and gluons).
This requires that we look in the possible multi-quark spectrum
for those combinations which have that property. Here I recall
the fact that multi-particle states are described by so-called
tensor products of single-particle Hilbert spaces. The single
(anti-)quark color states form a color (anti-)triplet
representation, denoted as $3$ and $\bar{3}$ respectively. The
tensor products can be split up again into irreducible
components or representations, for example:

$$
\begin{aligned}
3 \otimes \bar{3} &= 1 \oplus 8 \\
3 \otimes 3 &= \bar{3} \oplus 6 \\
3 \otimes 3 \otimes 3 &= 1 \oplus 8 \oplus 8 \oplus 10. \qquad \text{(II.6.13)}
\end{aligned}
$$


The dimension of the tensor product space is the product of the
dimensions of the factors. The weights of the tensor product
states are obtained by adding the weights of the individual
representations. This you may verify in the SU(3) weight space
of Figure II.6.6. What is clear from equation (II.6.13) is that
the simplest ways to make a color singlet ‘1’ representation
are combining a quark and an anti-quark, making a meson, or
making a particular combination of three quarks, making a
baryon.
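
The rule that weights simply add can be tried out directly. The following sketch is a bookkeeping exercise only, not a full decomposition into irreducible representations: it assumes one common normalization for the triplet weights (isospin and hypercharge labels), forms all sums of weights, and counts how often the zero weight occurs. A color singlet needs a zero weight, so a product without any zero weight cannot contain a singlet.

```python
from collections import Counter
from fractions import Fraction as F
from itertools import product

# Weights (T3, Y) of the color triplet 3. The normalization is an assumption;
# only the fact that the three weights add up to zero matters here.
TRIPLET = [(F(1, 2), F(1, 3)), (F(-1, 2), F(1, 3)), (F(0), F(-2, 3))]
ANTI_TRIPLET = [(-t3, -y) for t3, y in TRIPLET]
ZERO = (F(0), F(0))


def tensor_weights(*reps):
    """Weights of a tensor product: all sums of one weight from each factor."""
    return Counter(
        (sum(w[0] for w in combo), sum(w[1] for w in combo))
        for combo in product(*reps)
    )


meson = tensor_weights(TRIPLET, ANTI_TRIPLET)
diquark = tensor_weights(TRIPLET, TRIPLET)
baryon = tensor_weights(TRIPLET, TRIPLET, TRIPLET)

for name, weights in [("3 x 3bar", meson), ("3 x 3", diquark), ("3 x 3 x 3", baryon)]:
    print(name, "states:", sum(weights.values()), "zero weights:", weights[ZERO])
```

The quark-antiquark product has nine states with three zero weights, which fits one singlet plus the two central weights of the octet. The two-quark product has no zero weight at all, in line with the absence of a ‘1’ in the second line of equation (II.6.13).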


Is Einstein gravity a gauge theory? So we have found that the
gauge symmetry principle underlies the particular way the
force-carrying particles appear in nature. You may wonder
whether this trick also works for the gravitational force. Yes
indeed, it does! One interesting way to interpret the Einstein
theory is actually to look at it as a gauged version of the
combined local Lorentz and translation groups, usually referred
to as the Poincaré group. So in this perspective the Einstein
equations are an expression of a local Poincaré symmetry.


Kaluza-Klein theory. You could also argue the opposite way and
say that the E and B fields, the field strengths of
electromagnetism, correspond to electromagnetic ‘curvatures’ of
some internal space that is defined in every point in
space-time. Yet another way to understand it is to say that
space-time has in fact extra spatial dimensions, which have
particular geometries corresponding to circles, spheres or
group manifolds for that matter. These compact extra spaces are
then squeezed to zero size, by a procedure called ‘dimensional
reduction’ or ‘dimensional compactification’. This remarkable
idea in fact goes back to the early days of general relativity,
when Theodor Kaluza and Oskar Klein proposed to unify
electromagnetism and gravity in a five-dimensional theory using
this symmetry principle.


The proper mathematical setting for the classical versions of
gauge theories is that of fiber bundles with some Lie group G
or representation thereof as fibers, as we introduced them in
the section on ‘The physics of geometry’ on page 78 of Chapter
I.2. These geometric structures attracted the attention of the
physicists only long after the not so geometric Maxwell,
Einstein and Yang-Mills equations were written down. In fact
the formalisms were developed to a large extent independently
in physics and mathematics.


Non-abelian field strengths. You might complain that I am
choking my highly esteemed readers with math, but in my defence
I would argue that we have exposed some of the core ideas of
modern physics in only a few pages, and even without too much
cheating! In fact the 1954 paper of Yang and Mills is just a
short article that appeared in the journal Physical Review, and
its influence is inversely proportional to its length. There is
an ironic aspect to that paper, since the authors in fact
proposed that this non-abelian gauge theory should describe the
‘pions’, as these particles were at that time believed to
mediate the strong nuclear interactions. This idea didn’t work
out at all, and so these beautiful equations went into the
‘fridge’, and it took about 15 years before they were taken out
again and found their true vocation in the Standard Model as we
have described it.⁶ It is one of those rare occasions where the
elegance and beauty of an idea make it irresistible and
fortunately also inescapable, so one just had to wait for it to
find its proper place.


You might object by noting that the Kaluza-Klein idea of
dimensional compactification apparently has not properly
landed, in spite of being attractive and elegant as it
‘produces’ gauge fields with the correct interactions. The K-K
approach returned as a necessary ingredient of string theory,
but nevertheless has not yet found its true vocation, and I am
afraid it has to spend some more time in the ‘fridge’. Science
is patient, and even if an idea clearly ‘does not work’, it is
extremely hard to put stickers stating ‘Consume before date
indicated on the bottom’ on ideas.


⁶ A hallmark of great institutions is not only that they
attract extremely gifted people, but also that they are the
guardians of research fields, keeping alive a collective memory
of failed attempts and almost forgotten, unsolved problems; of
all that ended up in the ‘fridge of ideas’ so to speak.


Figure II.6.6: SU(3) roots and weights. In this figure we
represent the root diagram of SU(3), with the six non-zero
roots given by the green arrows. The gluons form the 8
representation, corresponding to the six non-zero roots and the
two in the center, marked by the bi-colored circles. Then there
are the triplet ($3$) and the anti-triplet ($\bar{3}$)
representations corresponding to the three colored
(anti-)quarks.


The Yang-Mills equations 


So are we done? No, not quite; we have to check one other
thing: what will happen to the analogue of the Maxwell
equations for the gauge fields? And what happens to the
electric and magnetic fields, so nicely encoded in the field
strength $F_{\mu\nu}$, if we go non-abelian? Two remarks are to
be made: (i) as F is built from the gauge field, it will also
live in the Lie algebra and should therefore simply transform
as $F \rightarrow F' = UFU^{-1}$, and (ii) this is only
achieved if the definition of F for the non-abelian theories is
generalized in a logical and elegant way to:

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + iq\,[A_\mu, A_\nu], \qquad \text{(II.6.14)}$$

logical and elegant because the commutator is antisymmetric in
the indices and also keeps you in the Lie algebra. An
equivalent, more covariant-looking definition is to say that
$F_{\mu\nu} = -\frac{i}{q}\,[D_\mu, D_\nu]$. The extra
commutator term in the field strength turns out to have huge
physical consequences.
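
That the commutator ‘keeps you in the Lie algebra’ can be verified explicitly for su(2). The sketch below assumes the usual normalization in which the generators are the Pauli matrices divided by two; it checks that the commutator of two generators is again i times a generator, with the Levi-Civita symbol as structure constants.

```python
PAULI = (((0, 1), (1, 0)), ((0, -1j), (1j, 0)), ((1, 0), (0, -1)))

# Generators T_a = sigma_a / 2 (assumed normalization).
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in PAULI]


def matmul(a, b):
    return [[sum(a[r][j] * b[j][k] for j in range(2)) for k in range(2)] for r in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][k] - ba[r][k] for k in range(2)] for r in range(2)]


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def structure_check(a, b):
    """Largest deviation of [T_a, T_b] from i * sum_c epsilon_abc T_c."""
    lhs = commutator(T[a], T[b])
    return max(
        abs(lhs[r][k] - 1j * sum(epsilon(a, b, c) * T[c][r][k] for c in range(3)))
        for r in range(2)
        for k in range(2)
    )


for a in range(3):
    for b in range(3):
        print(a, b, structure_check(a, b))
```

For an abelian group like U(1) all commutators vanish, and the field strength reduces to the familiar Maxwell form; the commutator term is what is new in the non-abelian case.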


Clearly the definitions of the non-abelian electric and
magnetic fields are nonlinear in the potentials, and this means
that the Yang-Mills equations, which are the generalizations of
the Maxwell equations to the non-abelian case, are nonlinear as
well. The Yang-Mills equations really are the dynamical
expression of non-abelian gauge symmetry. These equations take
the following form:

$$D^\mu F_{\mu\nu} = \partial^\mu F_{\mu\nu} + iq\,[A^\mu, F_{\mu\nu}] = 0.$$

They are strongly nonlinear indeed, in the first place because
the definition of the field strength is nonlinear in A, and
secondly because of the presence of the commutator term of A
with F in the equation itself.


Symmetry dictates the structure of interactions. The 
non-linearities mean that the theory is self-interacting right 
from the start. Whereas photons don’t see each other, 
gluons do, as we already showed in Figures I.4.36 and
I.4.37. We have reproduced the latter here to take a closer
look at how it connects to the more detailed description of 
non-abelian gauge theories we have given. 


Figure II.6.7: Color-flow diagram in QCD. A nice way to
visualize the interactions in QCD. Quarks carry a single color
line, while gluons carry two (different) lines. In the vertices
the color charge is conserved, so the colors and arrows have to
match. The upper index goes into the vertex, the lower index
goes out.

The local Lagrangian density is a Lorentz invariant expression
for non-abelian gauge fields (gluons) coupled to Dirac fermions
(quarks) and looks deceptively simple:

$$\mathcal{L}(x_\mu) = -\tfrac{1}{4}F^2 + i\bar{\psi}\,\slashed{D}\,\psi, \qquad \text{(II.6.15)}$$

with $\slashed{D} = \gamma^\mu D_\mu$ the Lorentz invariant
Dirac operator as it works on a four-component Dirac field
$\psi(x)$. In Figure II.6.7 we see two interaction vertices: on
the left we see a self-interaction of the gauge field,
corresponding to the third order term in A from the $F^2$ term
in the Lagrangian, and on the right we see the gauge field
interact with the Dirac field, corresponding to the cubic
interaction term from the covariant Dirac operator in the
Lagrangian. There is a lot of index gymnastics hidden in the
notation however. This becomes evident if we for example write
out the latter term in glorious detail. It looks quite
horrendous:


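$$-q\,\bar{\psi}^{a}_{i}(x)\,(\gamma^\mu)_{ij}\,A^{\gamma}_{\mu}(x)\,(T_\gamma)_{ab}\,\psi^{b}_{j}(x). \qquad \text{(II.6.16)}$$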


There are a few remarks to make with respect to this intri- 
cate expression: 


(i) the interaction is local, as all fields depend on the same
space-time point $x_\mu$;


(ii) all fields carry a space-time index that tells you how
they transform under Lorentz transformations, and a gauge index
that tells you how they transform under gauge transformations;


(iii) the Dirac fields carry two indices, a space-time spinor
index i with i, j = 1, ..., 4, and a ‘color’ index a with
a, b = 1, ..., n, with n the dimension of the color
representation (n = 3 for QCD);


(iv) the four gamma matrices carry a space-time (vector) index,
and each of them is a matrix in spinor space and has therefore
two spinor indices i, j;


(v) the gauge field has a space-time index $\mu$ and a gauge
group index $\gamma$ with $\gamma = 1, \ldots, \dim A$
($\dim A = 8$ for QCD);


(vi) the representation matrices or generators T carry a 
gauge group label y and each of them is a matrix in the 
representation space, thus with two indices a, b; 


(vii) all indices are pairwise contracted, and thus have to 
be summed over. This amounts to making invariant in- 
ner products in the spaces the indices refer to. In a sense 
the expression is therefore extremely simple because once 
you know what the symbols stand for, there is a strict logic 
which tells you where to put the various indices. It is dic- 
tated by the requirement of invariance of the interaction 
under independent changes of basis in either space-time, 
or spinor space, or in the Lie algebra or group representa- 
tion spaces; 


(viii) it is this delicate balancing act of indices that is for
example reflected in the way the ‘color’ lines in the diagram
of Figure II.6.7 are strictly continuous through the vertices.


Some people might say that it is ugly to exhibit all these
indices, while others say that that is exactly what makes the
very beauty of the construction manifest. The ultra-compact
notation of equation (II.6.15) demonstrates how effective the
symbolic notation is that the physicists have developed over
the years. The expression (II.6.16) in contrast shows very
explicitly how a particle in fact lives in many spaces
simultaneously, all with their own indices and metrics. All of
us agree that to do real calculations you have to go all the
way down into this index jungle; it is a must, a conditio sine
qua non! And once you realize in addition that this is only the
lowest order interaction diagram, you can imagine that it takes
a fully dedicated PhD researcher to complete a single higher
order calculation of some physically relevant process that is
measured in an accelerator.


Such calculations involve hundreds or even thousands of
diagrams to be added to get the full probability amplitude for
the process. The actual execution of such calculations nowadays
involves high-level AI in large-scale computing efforts, and it
is thanks to the rigorous underlying symmetry structures that
these calculations can be automated to such a large extent.


Self-interactions and the confinement problem. Free fields are
sometimes not as free as one would think. This in turn makes
perturbative approximations dangerous. A perturbative
approximation basically means that you start by setting the
coupling strength q to zero, and then take only low orders of q
into account. The problem is that if a field is
self-interacting, the theory becomes nonlinear and may end up
in a phase which is entirely different from what you would
naively expect. The relevant or observable degrees of freedom
can be very different from the degrees of freedom you started
out with. For example the enigmatic problem of quark
confinement can be traced back to the self-interacting nature
of the gluons. Free quarks have never been observed, because
they are doomed. They have to spend their whole life as a pair
or a ménage à trois, always confined within a hadron.


Understanding and proving these quantum confinement properties
of Yang-Mills theories from first principles is still an open
question and is one of the Millennium problems in mathematics.
It is a problem that attracts the minds of brilliant
mathematicians and theorists because it is a very well-defined
problem. The starting point is a familiar object called a
non-abelian gauge theory, or a principal fiber bundle with a
compact structure group. The quantum problem to be solved is:
prove the conjectured confinement property of the
‘color-electric’ fields. That this property holds has been
demonstrated by numerous computer simulations of the theory,
where the theory is formulated on a discrete space-time
lattice, but that amounts basically to a study of the strong
coupling (large q) limit of the theory. This is basically a
perturbative approach in 1/q. In that limit the theory does
confine, but to settle the question one has to prove that there
is no phase transition between the strong and weak coupling
regimes.


From experience — think for example of Fermat's last con- 
jecture — we know that such conjectures can linger around 
for centuries before they finally get turned into a theorem. 
A humble observation is that making the conjecture al- 
ready can make you famous. So the least we can say is 
that we are exploring deep waters! 


The conclusion is that the principle of local gauge invari- 
ance provided a valuable clue to the construction and un- 
derstanding of the fundamental equations underlying the 
Standard Model.


The symmetry breaking paradigm 


We have argued that symmetry principles play an important role
in modern-day physics. The same can be said about the concept
of symmetry breaking, which has found many beautiful and
surprising applications in basic high-energy physics as well as
in many branches of condensed matter and molecular physics.
Where symmetry unifies states and makes them degenerate, it is
the breaking of symmetries which creates non-uniformity and
diversity. We are going to explore some typical cases which
illustrate the power of this quantessential idea.


Symmetry breaking in objects. It is paradoxical that I first
let you suffer by talking so extensively about how beautiful
symmetries are, and then immediately afterwards confront you
with how to break them. It is like a small child building a
beautiful tower from wooden blocks and then destroying it while
screaming and dancing around it. Apparently there is some
thrill in the act of destruction! Let us look for similar
thrills, and first go back to the ‘symmetries of objects’, like
an equilateral triangle, a circle or a sphere, and then it is
not hard to imagine how to break the symmetry.


Figure II.6.8: The breaking of symmetries. Breaking symmetry by
deformation of an object. The D3 symmetry of the equilateral
triangle (with six elements) gets broken to the Z2 of an
isosceles triangle (with two elements), which subsequently gets
broken to the trivial group for an arbitrary triangle.


For example you could squeeze the object one way or another so
as to reduce its symmetry. You could do it stepwise like in
Figure II.6.8, where you first go from an equilateral to an
isosceles triangle, and then to a generic one. In that case you
first pass from the discrete group D3 with six elements (3
rotations, counting the identity, and 3 reflections) to the
group Z2 of two elements (the identity element and a
reflection), and in the second step you end up with no symmetry
at all: you are left with only the identity element. Breaking
has the property that the residual symmetry group after
breaking is just a subgroup of the original symmetry group.
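
The counting of symmetries in this example can be reproduced with a few lines of code. The sketch below is a brute-force illustration: it counts the relabelings of the three corners that leave all mutual distances unchanged. The coordinates of the three triangles are arbitrary choices made for this example.

```python
import math
from itertools import permutations


def symmetry_count(vertices):
    """Count the vertex relabelings that preserve all pairwise distances."""
    idx = range(len(vertices))

    def dist(i, j):
        return math.dist(vertices[i], vertices[j])

    return sum(
        all(
            math.isclose(dist(p[i], p[j]), dist(i, j), abs_tol=1e-9)
            for i in idx
            for j in idx
            if i < j
        )
        for p in permutations(idx)
    )


equilateral = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
isosceles = [(0.0, 0.0), (1.0, 0.0), (0.5, 2.0)]
generic = [(0.0, 0.0), (1.0, 0.0), (0.3, 2.0)]

for name, triangle in [("equilateral", equilateral),
                       ("isosceles", isosceles),
                       ("generic", generic)]:
    print(name, symmetry_count(triangle))
```

The three counts are the orders of D3, Z2 and the trivial group, and each group in this chain is a subgroup of the previous one.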


If you squeeze a ball top down, you typically get an ellipsoid,
where the symmetry is reduced to rotations around the vertical
axis only, and a reflection symmetry through the horizontal
plane and vertical planes through the center. If you then make
it into a standing egg shape, you lose the reflection property
in the horizontal plane but still keep the vertical symmetry
axis, and so on. By the way, this makes you wonder why eggs
have the shape they have. Why not celebrate the perfection of
life in perfect spheres? One reason that has been given is that
egg-shaped objects do not roll away: if you put them on the
table and push them away they tend to ‘boomerang’ in a little
circle. ‘They like to stay near their starting point!’ I hear
my mother say. And maybe the biology of how to lay an egg, to
push it out by contraction, plays a role as well in the optimal
egg design. What came first, the egg or the design? This is not
even a ‘chicken or egg’ question; instead, this is an ‘egg or
egg’ question. Anyway, this is more a topic in evolutionary
biology than in quantum physics I fear, so it is better to
leave it to the cloaca experts. The shapes created by symmetry
breaking are more and more diverse and need more and more
parameters to specify. In that sense their information content
and therefore entropy increases. And many will say that with
that their beauty increases as well.


Symmetry breaking by solutions of equations. The next step up
is to talk about the symmetry of equations, and the first
question that comes to mind is: what do the solutions of
equations with symmetries look like? Do they indeed manifestly
exhibit the symmetries of the equations? The answer is clearly:
No! Think of our nice Newtonian example again. The great step
forward was exactly to discover and understand that the
planetary orbits are not circles or even epicycles, but conic
sections: ellipses, parabolas and hyperbolas. So, where did the
spherical symmetry of the gravitational field around the sun
go, which is so clearly present in the equations? Why and where
does the immaculate perfection of the heavenly spheres get
lost?


A little thinking yields the answer: the symmetry is still
there, but the symmetry transformations act on the space of
solutions. Acting with a symmetry operator on a particular
solution will in general generate a different solution. The
symmetries map solutions onto each other, and as they keep the
equations fixed, they transform solutions with equal energy
into each other. With the rotations that is quite obvious, as
we have illustrated with the elliptic orbits of the spherically
symmetric Newtonian Sun-Earth system in Figure II.6.9. It turns
out that the Runge-Lenz symmetry changes the eccentricity of
the elliptic orbit, and that is not so obvious. It is in this
sense that you may say that most particular solutions break the
symmetry of the equation, and the symmetry acts in the space of
solutions. It creates a subspace of degenerate solutions in the
space of all solutions. That space gets ‘stratified’ according
to its energy values and solution shapes.


Figure II.6.9: Action of rotational symmetry on an elliptic
orbit solution. The Newtonian Earth-Sun system has spherical
symmetry, but that symmetry is not manifest in a particular
solution, like for example an elliptic orbit. The symmetry
transforms different equal-energy solutions into each other.


This brings us in fact close to the observations we have made
with respect to the role symmetries play in quantum theory,
labeling the degenerate states but also moving (stepping)
between them. They walk you through the degenerate subspace of
the total sample space of your favorite framework.


Symmetry breaking in the atom. Symmetry breaking is 
an important concept. What does symmetry breaking look 
like in a quantum setting? Imagine that we have a symmetry,
then we could make that symmetry visible by ‘breaking’ it,
in other words by adding a term to the Hamiltonian
that explicitly breaks the symmetry. For example, if we put
an atom in a magnetic field, say along the z-direction, then
there will be an extra term in the Hamiltonian proportional
to $L_z$ and the magnitude $B$ of the magnetic field. Now the
three-dimensional rotational symmetry is broken to rotations
around the z-axis only. The consequence is that the
energy levels which were at first degenerate, and therefore
hard to distinguish, will now split up in proportion to the
value of their magnetic quantum number $m$. This is the
famous splitting first observed by Pieter Zeeman that we
discussed in Chapter I.4. This is an example of explicit
symmetry breaking, where we change the Hamiltonian. But
also in quantum theory we can have the phenomenon of 
spontaneous symmetry breaking which refers to a situa- 
tion where we change external parameters of the system 
— say the temperature or a coupling — such that the Hamil- 
tonian itself does not change and still has all the symme- 
tries, but it is the ground state that changes to one in which 
the symmetry is broken. 
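
To put rough numbers on the explicit breaking by a magnetic field, here is a minimal sketch. It assumes the extra term takes the textbook form $\mu_B B L_z/\hbar$ of the normal Zeeman effect, with the electron spin ignored, so that each level shifts by $m\,\mu_B B$. That form, the function name and the sample field strength are illustrative assumptions, not taken from the text above.

```python
# Normal Zeeman effect: energy shift of orbital sublevels in a magnetic field.
# Assumption: electron spin is ignored, so the shift is simply m * mu_B * B.

MU_B_EV_PER_TESLA = 5.7883818060e-5  # Bohr magneton in eV/T (CODATA 2018)


def zeeman_shifts(l, b_field):
    """Return {m: energy shift in eV} for m = -l, ..., +l at field b_field (tesla)."""
    return {m: m * MU_B_EV_PER_TESLA * b_field for m in range(-l, l + 1)}


if __name__ == "__main__":
    # A p-level (l = 1) in a 1 tesla field splits into three equally spaced levels.
    for m, shift in zeeman_shifts(l=1, b_field=1.0).items():
        print(f"m = {m:+d}: shift = {shift:+.3e} eV")
```

The degenerate multiplet of $2l+1$ states opens up into equally spaced levels labeled by $m$, which is how the field makes the hidden rotational symmetry visible.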


Low energy modes. This brings us to a follow-up ques- 
tion: what happens if the ground state is not invariant and 
does not respect all the symmetries? In other words, what 
if the ground state breaks the symmetry? Well, by what we 
argued above, it will then necessarily be the case that that 
ground state is not unique and itself degenerate. If that 
ground state breaks a continuous symmetry, we will have 
a continuous set of equivalent ground states. And what 
that means is intuitively quite clear: the system can easily 
move from one ground state to one nearby and it would 
cost basically no energy.

Figure II.6.10: Long range orientational order. The collective
of wheat plants is in a state that exhibits a long range order. By
growing out of spherical seeds, the original rotational symmetry
is broken.


Saying it yet differently, the generators of the symmetries 
that are broken create ‘zero (energy) modes’ of the sys- 
tem. This is an important physical signature of broken 
symmetry: the appearance of low energy modes in the 
system that are easy to excite. And if we talk about (rel- 
ativistic) field theory where the energy includes also the 
mass, our observation asserts that there will be massless 
particles around. Such particles are called Goldstone par- 
ticles or modes, after the MIT physicist Jeffrey Goldstone 
who discovered the mechanism. Ideally these modes are 
exactly massless, but there can be additional effects that 
give those particles a mass. However that mass should 
be small compared with the scale of the interaction energy 
that caused the breaking. 


Think of wheat seeds, if we assume them to be spheri- 
cal, spreading them on a field gives a ‘ground’ or better 
‘down to earth’ state that is rotationally invariant, which 
means to say that we can rotate each of the seeds by 
the same amount and nothing will change. Now we wait

Figure II.6.11: Wheat waves as low energy excitations. The
wheat field at rest shows the long-range order typical for a broken
symmetry. It has low energy excitations. These are the
‘wheat waves’ that propagate easily and can already be excited 
by a gentle breeze. 


a month or more, and the seeds turn into plants nicely 
growing up, all beautifully lined up vertically, so the ground 
state has changed to a ‘field’ in a completely ordered state
that certainly is a state of broken symmetry. There is a 
spontaneous, average length which is non-zero, and fur- 
thermore a long-range vertical orientational order in the 
system which breaks the original spherical symmetry (see 
Figure II.6.10). 


Now where is the zero mode? Those modes correspond 
to what you get if a light breeze goes over the field and 
you see gentle plane waves traverse the wheat plants (see 
Figure II.6.11). It is a low energy collective mode that originates
in the broken symmetry of the ground state. Amusing
and playful for sure, but we had better take it seriously, because
there are many examples of this so-called Goldstone
mechanism, from spin waves or magnons in magnets
to the appearance of the three nuclear particles known
as pions, $\pi^\pm$ and $\pi^0$, which we mentioned in Chapter I.4.


[Figure labels: massless or zero mode (flat direction); massive mode (curved direction)]


Figure II.6.12: Breaking of global symmetry. The breaking of a 
U(1) global symmetry leading to a ‘Mexican hat’ potential. The 
minimum is not unique but there is a continuum of ground states 
forming a circle. The breaking leads to one massless and one 
massive mode as indicated in the figure. 
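
The flat and curved directions of the ‘Mexican hat’ can be checked numerically. The sketch below assumes the common form $V = \lambda(x^2 + y^2 - v^2)^2$ for the potential, with the two real field components written as $x$ and $y$. That form, the parameter names `lam` and `v`, and the finite-difference step are assumptions made for illustration. It estimates the second derivative of $V$ at a point on the circle of minima, once across the valley and once along it.

```python
def mexican_hat(x, y, lam=1.0, v=1.0):
    """Assumed 'Mexican hat' potential with a circle of minima at radius v."""
    return lam * (x * x + y * y - v * v) ** 2


def second_derivative(f, point, direction, h=1e-4):
    """Central finite-difference second derivative of f along a unit direction."""
    x, y = point
    dx, dy = direction
    forward = f(x + h * dx, y + h * dy)
    backward = f(x - h * dx, y - h * dy)
    return (forward - 2.0 * f(x, y) + backward) / (h * h)


def curvatures_at_minimum(lam=1.0, v=1.0):
    """Curvature across the valley (radial) and along it (angular) at (v, 0)."""
    def potential(x, y):
        return mexican_hat(x, y, lam, v)

    minimum = (v, 0.0)
    radial = second_derivative(potential, minimum, (1.0, 0.0))
    angular = second_derivative(potential, minimum, (0.0, 1.0))
    return radial, angular


if __name__ == "__main__":
    radial, angular = curvatures_at_minimum()
    print(f"radial curvature  (massive mode):  {radial:.4f}")
    print(f"angular curvature (massless mode): {angular:.2e}")
```

The radial curvature comes out close to $8\lambda v^2$, while the curvature along the circle vanishes up to discretization error: one massive and one massless mode.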


Chiral symmetry breaking. A famous application of the 
symmetry breaking concept is provided by the three pion 
particles ($\pi^\pm$ and $\pi^0$). The interpretation is that they are the
Goldstone particles associated with what is called chiral
symmetry breaking. It refers to an ingenious scenario proposed
by the Japanese-American physicist Yoichiro Nambu,
who received the Nobel Prize in Physics in 2008 for,
I quote, ‘the discovery of the mechanism of spontaneous
broken symmetry in subatomic physics’. The scenario
starts with massless up ($u$) and down ($d$) quarks.
These are described by massless Dirac equations, but the
massless Dirac equation can be split into two non-interacting
pieces, the right (R) and left (L) polarized components.
Said differently, it is precisely the mass term in the
equations that couples the left to the right polarized components.
If you look at the tables of the Standard Model in
Figure I.4.35, you see that there is the horizontal so-called
isospin symmetry between $u$ and $d$ quarks. This means
that the massless equations have an $SU(2)_L$ symmetry
transforming $u_L$ and $d_L$ into each other (so they form an
isospin one-half representation), and an $SU(2)_R$ symmetry
transforming the right-handed components $u_R$ and $d_R$
into each other. So at this stage the model has a six-dimensional
$G = SU(2)_L \otimes SU(2)_R$ symmetry. This is
called the chiral symmetry group, which is a global symmetry.
Nambu suggested that a quark anti-quark condensate
forms spontaneously, so that the particular diagonal
combination of fields $u$ and $d$ becomes the order parameter
and acquires a vacuum expectation value:


\[ \langle \phi \rangle = \langle \bar{u}_L u_R + \bar{d}_L d_R \rangle = f_\pi \neq 0 \]


Now this condensate breaks the symmetry $G$, but not completely.
What is left you can see from the condensate,
namely, if we simultaneously transform left and right then 
the condensate is invariant. This in turn tells us that from 
the six generators a particular ‘vector like’, ‘left plus right’ 
SU(2) subgroup survives, while the rest, the three ‘left mi- 
nus right’ generators, will be broken. These give rise to 
three Goldstone particles with exactly the quantum num- 
bers that correspond to the three pion particles. The fact 
that these particles in the end do have a relatively small 
mass is accounted for by the fact that the masses of the 
quarks were not quite zero to start off with. 


The breaking systematics. In the chapters on condensed 
matter physics we will return to this topic of symmetry break- 
ing in the context of many body physics. The general 
picture boils down to a situation where the theory has a 
continuous symmetry group G of dimension dim G, and 
some field gets a non-zero ground state expectation value. 
That particular vacuum state is only invariant under a residual
symmetry group $K \subset G$, which is a subgroup of $G$.
Then there will be $\dim G - \dim K$ broken symmetries
and therefore the same number of Goldstone modes. The 
field that acquires the non-zero expectation value in the 
ground state and breaks the symmetry is called an order 
parameter field. The nomenclature is that the broken state 
is the state in which everything is neatly lined up some way 
and therefore exhibits ‘order’, where order is defined as the 


presence of long-range correlations in the medium. 
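
The counting rule $\dim G - \dim K$ is easy to mechanize. The sketch below uses the standard dimensions $\dim SU(n) = n^2 - 1$ and $\dim SO(n) = n(n-1)/2$. The function names are hypothetical helpers introduced only for this illustration.

```python
def dim_su(n):
    """Dimension (number of generators) of SU(n)."""
    return n * n - 1


def dim_so(n):
    """Dimension (number of generators) of SO(n)."""
    return n * (n - 1) // 2


def broken_generators(dim_g, dim_k):
    """Number of broken symmetries when a group of dimension dim_g breaks to dim_k."""
    if dim_k > dim_g:
        raise ValueError("the residual group cannot be larger than the full group")
    return dim_g - dim_k


if __name__ == "__main__":
    # Chiral symmetry: SU(2)_L x SU(2)_R breaks to the diagonal SU(2).
    print("chiral breaking:", broken_generators(2 * dim_su(2), dim_su(2)))
    # An iso-vector picking a direction: SO(3) breaks to SO(2).
    print("SO(3) -> SO(2): ", broken_generators(dim_so(3), dim_so(2)))
    # The 'Mexican hat': U(1) breaks completely.
    print("U(1) -> nothing:", broken_generators(1, 0))
```

The first case reproduces the three pions, the second the two would-be Goldstone modes of an iso-vector condensate, and the third the single flat direction of the Mexican hat.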


Ferromagnetism. As an example think of a metal where 
all the atomic magnetic moments (spins) in the absence of an external
field are pointing in random directions in the medium,
and therefore there is no over-all magnetization, and no 
macroscopic direction of the magnetic field is discernible. 
If one then lowers the temperature below what is called 
the Curie temperature, the thermal energy gets so small 
that the weak interaction between the tiny magnets starts 
to become dominant and the spins minimize their energy 
by lining up and thereby ‘spontaneously’ make a magnet. 
So, by cooling down a metal spontaneous magnetization 
occurs and conversely, by heating up a magnet to high 
temperature it will lose its magnetization and the symme- 
try will be restored. Spontaneous magnetization serves 
as the prototype of spontaneous symmetry breaking in a 
many-body system. And indeed the low energy modes are 
just the spin waves which are easy to excite in a magne- 
tized medium. 


Topological defects. In Volume III of the book we will ad- 
dress another crucial aspect of symmetry breaking, which 
is the appearance of what are called topological defects. 
Defects are collective excitations which are usually ‘heavy’ 
and not so simple to excite, but once they exist they are 
equally hard to get rid of. A dramatic instance you are all 
familiar with from watching the news is the phenomenon 
of tornadoes or vortices in liquids. In that case there is 
a ground state that is symmetric if there is no wind, but
when a wind starts blowing we have at once, at every
point in space, a local vector pointing
in the direction of the wind. On the surface of the earth
we can think of the non-zero two-dimensional vector field 
representing the wind as an order parameter. As a con- 
sequence of some ‘massive’ obstacles it may happen that 
somewhere a pair of vortices with opposite vorticities is 
created, and once these get well separated they are highly 
stable objects. As a matter of fact you cannot destroy single
vortices by locally disturbing them, you have to wait
till their energy gets dissipated, for example by causing a
lot of damage. There are many examples of remarkably 
stable collective excitations in all kinds of fields of science 
and technology that can be thought of as topological de- 
fects that originate in a state of broken symmetry. 


Hidden gauge symmetries: the Higgs particle. So far 
we have only looked at rigid or global symmetries: we con- 
sidered transformations that were the same at any point in 
space and we found the remarkable directly observable 
phenomena of low energy Goldstone modes and high-en- 
ergy defects as hallmarks of their breaking. The next ques- 
tion that naturally arises in field theory is what happens if 
we somehow ‘break’ a local gauge symmetry? You may 
think of the U(1) gauge symmetry of electrodynamics, or 
the SU(2) gauge symmetry of the weak interactions. Again 
this may happen spontaneously, meaning to say that the 
system of equations still has the full symmetry, but that the 
solution, in particular the ground state, does not. The first 
question to answer is whether this can be done at all. Is 
it possible to maintain the local gauge symmetry and yet 
have a ground state in which some field acquires a non-zero
expectation value? The answer turns out to be ‘approximately
yes’. A first example was exhibited by Ginzburg
and Landau in their effective description of superconductivity.
Later it was understood and explained in full detail
in the modern theories of Bardeen, Cooper and Schrieffer, 
and later Anderson, about which we have more to say in 
Chapter III.3. 


The Brout—Englert—Higgs (BEH) mechanism 


A beautiful example of the spontaneous breaking of a non-abelian
gauge symmetry is the Brout-Englert-Higgs mechanism,
accounting for the heavy mass of the weak force
mediating $W^\pm$ and $Z^0$ particles in the weak and electromagnetic
interactions, and more indirectly for the existence
of the Higgs particle. Let me illustrate how that comes 


about in a simpler model due to Sheldon Glashow, without 
going into much detail. 


Breaking in an SU(2) model. Let us consider an SU(2) 
(or SO(3)) gauge theory coupled to a ‘matter’ field that 
transforms like a triplet or iso-vector under the gauge group. 
This means that we should think of the gauge field as
$A_\mu^a T_a$, where the $T_a$ are now the three $3 \times 3$ matrices
generating the SO(3) symmetry. It has three gauge particles
(like the W-bosons we discussed before) because
the group is three-dimensional. The ‘matter’ field $\phi(x)$ is
a triplet of space-time scalar fields that transform like a
3-dimensional ‘iso-vector’ under the SO(3) gauge group.
In the quantum context the field $\phi(x)$ would therefore describe
three types of scalar particles. Let us now assume
that this field $\phi(x)$, or rather its square which is gauge
invariant, develops a constant vacuum expectation value
$\langle |\phi|^2 \rangle \neq 0$. So a condensate forms. The situation is similar
to the magnets we just discussed, but now we think of
it happening in some internal space where the force field
is active, and where the $\phi$ field describes an iso-vector
degree of freedom at every point in space.


As long as the vacuum expectation value vanishes the 
symmetry is not broken, but if the iso-vector is non-zero, 
and chooses some fixed direction it is like a wheat field 
and the non-zero vector field is only invariant under rota- 
tions around the axis in the direction in which the nonzero 
iso-vector points, corresponding to an SO(2) subgroup of 
dimension one. So we expect there to be two massless 
Goldstone particles, like in the case we discussed before. 
But now in addition we have the gauge fields that are cou- 
pled to this iso-vector through a covariant derivative. The 
question is then what effect the vacuum expectation value
of the scalar field has on the gauge fields. The resulting
mechanism is powerful and quite universal. 


To see what happens we write for the iso-vector (and think 
of it as a three component column-vector) in the ‘broken’
phase:

\[ \phi(x) = \phi_0\, \hat{e}_3 + \delta(x), \]

where $\phi_0$ is the constant non-zero vacuum expectation
value pointing in the third direction of iso-space and
$\delta(x)$ describes the field fluctuations around that ground
state value. Now the interactions are generated by the covariant
derivative:

\[ D_\mu \phi = (\mathbb{1}\, \partial_\mu + i q A_\mu^a T_a)\, \phi, \]

where the $T_a$ are the three generators of rotations in iso-space.
At this point the crucial observation is that there
are two components of the gauge field that ‘see’ or sense
the vacuum value, while the third component does not, because
it is linked to the generator of the residual symmetry
which leaves the condensate unchanged. The Lagrangian
density $\mathcal{L}$ of this theory contains a term proportional to
$|D_\mu \phi|^2$. Of interest here is only what the effect is of the
constant $\phi_0$ in the Hamiltonian. When you work out the
interaction between the gauge field and the vacuum term
you discover that it leads to a quadratic term proportional
to $|\phi_0|^2$ of the form:

\[ \Delta \mathcal{L} \propto |\phi_0|^2 \left( (A^1)^2 + (A^2)^2 \right), \]

and this is exactly what a mass term for the two compo- 
nents of the gauge field would look like. Apparently we 
have generated a mass for two of the three force carrying 
particles, a mass proportional to the non-zero expectation 
value $\phi_0$. So, we end up with one massless force component
($A^3$), which is long-range like the photon, and two
massive force particles $A^1$ and $A^2$. The latter two can be
recombined into the components $A^\pm$, which are charged with
respect to the massless $A^3$ field. Because of their mass
these fields mediate a short-range interaction described
by a Yukawa potential, as we explained in Chapter I.4. They
would be the lookalikes of the $W^\pm$ particles. What we just
described amounts to a simplified analogue of the Brout- 
Englert-Higgs mechanism in the Standard Model, which 
indeed explains the masses of the W and Z bosons me- 
diating the weak interactions, and the photon remaining 
massless. 
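
The statement that exactly two of the three gauge components ‘see’ the condensate can be verified with a few lines of arithmetic. The sketch below assumes real rotation generators $(L_a)_{bc} = -\epsilon_{abc}$ acting on the iso-vector, and takes the quadratic form in the gauge fields to be $M_{ab} = q^2 (L_a \phi_0)\cdot(L_b \phi_0)$. The overall normalization is a convention chosen here for illustration, and the function names are hypothetical.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def generator_on(a, phi):
    """Act with the a-th real rotation generator, (L_a)_{bc} = -eps_{abc}, on phi."""
    return [-sum(levi_civita(a, b, c) * phi[c] for c in range(3)) for b in range(3)]


def gauge_mass_matrix(phi0, q=1.0):
    """Quadratic form M_ab = q^2 (L_a phi0) . (L_b phi0) in the gauge fields."""
    images = [generator_on(a, phi0) for a in range(3)]
    return [
        [q * q * sum(x * y for x, y in zip(images[a], images[b])) for b in range(3)]
        for a in range(3)
    ]


if __name__ == "__main__":
    # Condensate pointing in the third iso-direction, as in the text.
    for row in gauge_mass_matrix([0.0, 0.0, 2.0], q=0.5):
        print(row)
```

For a condensate along the third direction the matrix is diagonal, with equal non-zero entries for the first two components and a zero for the third, which is the component tied to the unbroken rotation around the condensate.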


Searching for the Higgs. The remaining question is where 
does the celebrated Higgs particle reside in this scenario? 
I have not yet mentioned it. To understand its origin we
have to do some counting of the degrees of freedom of the 
particles before and after the condensate forms. 


Let us start with a massless force mediator like the photon.
In Chapter I.1 we showed that the photon field $A_\mu$ has
two transversal polarization states orthogonal to its propa- 
gation direction. It is important to know that this transver- 
sality has everything to do with the fact that the photon is 
massless and, as we have argued before, it is the gauge 
invariance that effectively removes one degree of freedom 
from the three-component ‘vector’ potential. It is indeed 
the gauge invariance that — so to speak — protects the 
masslessness of a gauge particle like the photon. To get 
massive it would need the extra (longitudinal) component 
which is just not there, basta! 


To continue our counting exercise, each component of the
iso-vector field $\phi$ represents one field degree of freedom,
independent of whether it is massive or massless. Suppose
we take it to be a massive field, then after breaking
we create two massless Goldstone degrees of freedom
while the third iso-component remains massive. Now
comes the magic of the Higgs mechanism: the massless
modes of the $\phi$ field get ‘eaten’ by the corresponding gauge
particles, which become massive stante pede after this
exquisite meal. Because a massive vector field needs three
polarization states, it has two transversal components like 
the photon, but also a longitudinal component, which the 
massless photon does not have. So, the upshot of the 
exercise is crystal clear: if we ‘break’ a gauge symmetry 
then the forces in the unbroken group stay unchanged
but the force mediating particles that correspond to the 
broken generators, become massive and therefore short- 
range. And they become massive by absorbing the would- 
be Goldstone modes, which consequently disappear from 
the spectrum. There are no massless Goldstone particles 
but instead we have two massive vector bosons! 
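
The bookkeeping behind this argument can be written down explicitly: a massless vector carries two polarization states, a massive vector three, and each real scalar one. The sketch below simply tallies these numbers for the triplet model before and after the condensate forms. The function name is a hypothetical helper.

```python
def count_degrees_of_freedom(massless_vectors, massive_vectors, real_scalars):
    """Total field degrees of freedom: 2 per massless vector, 3 per massive one, 1 per scalar."""
    return 2 * massless_vectors + 3 * massive_vectors + real_scalars


if __name__ == "__main__":
    # Before: three massless gauge fields and a triplet of scalars.
    before = count_degrees_of_freedom(massless_vectors=3, massive_vectors=0, real_scalars=3)
    # After: one massless gauge field, two massive ones, one leftover scalar.
    after = count_degrees_of_freedom(massless_vectors=1, massive_vectors=2, real_scalars=1)
    print("before:", before, "after:", after)
```

The totals agree: the two Goldstone modes have not vanished, they have become the longitudinal components of the two massive vector fields, and a single scalar is left over.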


And now, to finally answer the question that got us into all 
this counting in the first place, where is that Higgs particle? 
The answer can only be that that particle corresponds ex- 
actly to the single leftover massive degree of freedom, the 
third component of that iso-vector $\phi$ we started off with. So
it is not the massless Goldstone degree of freedom that 
signals the breaking in this gauge symmetry setting, but 
the smoking gun is a neutral (it does not couple to the surviving
photon-like particle) massive scalar particle. What
we learn is that the Higgs particle is not the condensate 
which gives the force carriers mass, but rather the quan- 
tized wave that rides on top of that condensate! It is a bit 
like having a transition from vapor to liquid water, which 
after the transition allows waves to propagate on the water
surface. The degrees of freedom that acquire mass are
the ones that have to wade through the water, which makes
them feel heavy indeed. The Higgs particle is the necessary
witness without alibi of this beautiful but intricate
mechanism. The discovery of this unique feature that vin- 
dicates the BEH mechanism, a backbone of the Standard 
Model, by the ATLAS and CMS collaborations at CERN in 
2012 was therefore a landmark discovery. 


The mixing of weak and electromagnetic interactions. 
In the example above we have looked at the breaking of an 
SO(3) symmetry by a non-zero vacuum expectation value 
of an iso-vector or triplet field $\phi$, giving rise to masses for
two of the three gauge fields. This is not quite the way 
the symmetry breaking works in the Standard Model. In 
the sector of the weak and electromagnetic interactions 
we have a gauge group $SU(2) \times U(1)$ involving the three
gauge fields $W^\pm$ and $W^0$ for the $SU(2)$, and a gauge field
$Y$ for the $U(1)$ factor. This group is broken to a residual
$U(1)_{em}$, corresponding to the massless photon. This
can be achieved by a non-vanishing expectation value for a
scalar field that transforms like a doublet under the $SU(2)$
and is also charged with respect to the $U(1)_Y$ field. The
net effect is that one is left with three massive gauge particles:
the $W^\pm$ and the neutral $Z^0$ boson, which is a linear
combination of the $W^0$ and $Y$ fields. The other, orthogonal


linear combination of those two neutral fields corresponds 
to the photon. This intricate mixing of symmetries
reminds us of the fact that nature does not always celebrate
ultimate simplicity.
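
The mixing of the two neutral fields can be made explicit with a small calculation. The sketch below assumes the standard textbook form of the neutral mass-squared matrix in the $(W^0, Y)$ basis, $\tfrac{v^2}{4}\begin{pmatrix} g^2 & -g g' \\ -g g' & g'^2 \end{pmatrix}$, where $g$ and $g'$ are the two gauge couplings and $v$ is the vacuum expectation value. None of these symbols appear in the text above, and the numerical values used are rough illustrative inputs.

```python
import math


def neutral_mass_squared_eigenvalues(g, g_prime, v):
    """Eigenvalues (smaller, larger) of the assumed 2x2 mass-squared matrix in the (W0, Y) basis."""
    scale = v * v / 4.0
    a = scale * g * g
    b = -scale * g * g_prime
    d = scale * g_prime * g_prime
    trace = a + d
    det = a * d - b * b
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace - root) / 2.0, (trace + root) / 2.0


def photon_direction(g, g_prime):
    """Unit vector in the (W0, Y) plane along the massless combination."""
    norm = math.hypot(g, g_prime)
    return g_prime / norm, g / norm


if __name__ == "__main__":
    g, g_prime, v = 0.65, 0.36, 246.0  # rough illustrative values, v in GeV
    photon_m2, z_m2 = neutral_mass_squared_eigenvalues(g, g_prime, v)
    print(f"massless combination: m^2 = {photon_m2:.3e}")
    print(f"massive combination:  m   = {math.sqrt(z_m2):.1f} GeV")
    print("photon direction in (W0, Y):", photon_direction(g, g_prime))
```

One eigenvalue vanishes identically, whatever the couplings: that combination is the photon. The orthogonal combination picks up the mass and plays the role of the $Z^0$.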


A symmetry not broken, but hidden. The above account 
of the BEH mechanism can be criticized on valid grounds. 
It may even be called misleading. I used this narrative for
pedagogical reasons, because it borrows some of the vocabulary
of the global symmetry breaking scenario. But a
deeper fact is that the vacuum expectation value as I discussed
it is gauge dependent. Because of the local gauge
invariance, I can locally transform that vacuum vector in
any direction I want, so the analogy with the phenomenon
of magnetization, where that direction is directly observable
and fixed, is wrong. The proper way to talk about the BEH
mechanism is to say that the invariant square of the covariant
derivative acquires a vacuum expectation value, which
directly translates into the mass terms for the vector parti- 
cles. In other words there is a way of talking about this so- 
called breaking in a gauge invariant way. But then we have 
arrived at a contradictio in terminis, because if the mecha- 
nism can be cast in gauge invariant terms, then the gauge 
symmetry cannot be broken! Indeed! This is the reason 
that we rather speak of a hidden symmetry, the gauge in- 
variance is still present, but is no longer manifest in the 
physics (the mass degeneracy), it is hidden. It is better 
to say that the gauge symmetry is not really broken at all, 
but realized in a different way in this physical model. This 
point of view is strongly supported by the technical fact that 
there is not necessarily a real phase transition between the 
hidden and manifest symmetric (confining) phase of the 
system. 


Other forms of symmetry. We have in passing already 
referred to other symmetry types than the ones we have
been considering here. 


An important extension of space-time symmetries to so-called
supersymmetries was a remarkable achievement.
The related super-algebras are not of the Lie algebra type,
because they also involve fermionic generators that obey 
anti-commutators. If these extended symmetries are made 
local by gauging them, you need to introduce a spin-3/2 
gravitino as the super partner of the graviton. As the names 
suggest these symmetries play a vital role in super string 
theory and super gravity theories and we commented on 
them at the end of Chapter I.4. The experimental program
at the Large Hadron Collider at CERN has been search- 
ing for the lightest super particle that should exist in any 
supersymmetric theory with broken supersymmetry. And 
as we have not run into any superpartner of any particle in 
the Standard Model we have to assume that supersymmetry
should be broken already at a high energy, well above
1 TeV.


Later, a remarkable class of algebras was discovered:
these are infinite-dimensional Lie algebras that are
also known as Kac-Moody algebras. They have found interesting
applications in two-dimensional physics, both in
string and condensed matter theory. It is a very high level
of symmetry. After what we have said before, one expects
in this case there to be an infinite number of conservation
laws, which is almost tantamount to saying that models
in which they feature, in spite of being very nonlinear, are
basically exactly solvable.


Finally there is a class of symmetries related to what we 
called topological phases in matter, which are called Hopf 
algebras or quantum groups. The remarkable aspect of 
their application in two-dimensional physics is that their 
representations describe both the ordinary excitations, and 
the topological defects and their dyonic mixtures called 
anyons. These correspond to the exotic particles we briefly 
described towards the end of the previous chapter.


A detailed discussion of the symmetries we just mentioned 
is beyond the scope of this book, but we mention them to 
emphasize the richness of the symmetry concept in math- 
ematical physics. 


Symmetry concepts and terminology 


We have explored many aspects of the notion of symme- 
try in this chapter. First we searched for the observables 
$Q_i$ that commute with the Hamiltonian. These correspond
to conserved quantities and form some Lie algebra including
the Hamiltonian $H$, which is then called the symmetry
algebra $Q$. The states of the system at some fixed
value of the energy will form a degenerate set that cor- 
responds to certain representations of the symmetry alge- 
bra. The degenerate states can be labeled by the eigenval- 
ues of some mutually commuting subset of the symmetry 
generators, forming a so-called Cartan subalgebra H of 
the symmetry algebra. The choice of Cartan subalgebra 
corresponds to choosing a framework F. The other sym- 
metry operators that are not in the Cartan subalgebra can 
be combined into raising and lowering operators that walk 
you through the sample space of the chosen framework. In 
the following table we have summarized the correspondence
between the physical and mathematical concepts underlying
the notion of symmetry.

Continuous symmetries: Math (group theory) versus Physics (quantum theory: Hilbert space of states, algebra of observables)

Lie algebra $A$, $\dim A = d$
  Observables: $\{A_i\}$, $i = 1, \ldots, d$
  Hermitian: $A^\dagger = A$
  Commutator algebra: $[A, B] = iC$
  Infinitesimal transformations: $\delta|\psi\rangle = iA|\psi\rangle$
  Invariant polynomials (Casimirs): $C_k$ ($k = 1, \ldots, r$), $[C_k, A] = 0$

Cartan subalgebra $H$, $\dim H = \mathrm{rank}\, A = r$
  $H \subset A$, corresponds to a framework $F$
  Mutually commuting (= Abelian): $H_i \in H$, $i = 1, \ldots, r$
  Labels basis states of representation $N$
  Weight vectors $\{\lambda_m\}$: $|\psi\rangle = \sum_m c_m |\{\lambda_m\}\rangle_N$
  Raising and lowering operators

Symmetry algebra $Q$, $\dim H \leq \dim Q \leq \dim A$
  Subalgebra $Q \subset A$: $\{Q_i\}$
  All $Q_i$ commute with the Hamiltonian $H_0$: $[Q_i, H_0] = 0$
  $Q_i$ ~ conserved quantities
  $Q_i$ ~ generate symmetry transformations
  Time independent labeling of states: if $H_0 \in H$ then $H \subset Q$ and $\{\lambda_i\} \subset \{q_i\}$

Lie group $G$, $\dim G = \dim A$
  Unitary representations: $U^\dagger = U^{-1}$
  Transformation group on Hilbert space
  Finite transformations: $G \sim e^{iA}$
  Group space coordinates
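
The entries of this summary can be checked on the smallest non-trivial example. The sketch below takes the angular momentum algebra in its three-dimensional ($l = 1$) representation, with $\hbar = 1$ and basis states ordered as $m = +1, 0, -1$. It verifies that $L_z$ (the Cartan generator) labels the states, that $L_\pm$ step between them, and that the Casimir $L^2$ commutes with everything. The choice of representation and the helper names are assumptions made for illustration.

```python
import math

ROOT2 = math.sqrt(2.0)

# Spin-1 matrices in the basis m = +1, 0, -1 (hbar = 1).
L_Z = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -1.0]]
L_PLUS = [[0.0, ROOT2, 0.0], [0.0, 0.0, ROOT2], [0.0, 0.0, 0.0]]
L_MINUS = [[0.0, 0.0, 0.0], [ROOT2, 0.0, 0.0], [0.0, ROOT2, 0.0]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def scaled(factor, a):
    """Multiply every entry of a by factor."""
    return [[factor * x for x in row] for row in a]


def is_close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def casimir():
    """L^2 = L_z^2 + (L_+ L_- + L_- L_+) / 2."""
    lz2 = matmul(L_Z, L_Z)
    pm, mp = matmul(L_PLUS, L_MINUS), matmul(L_MINUS, L_PLUS)
    return [[lz2[i][j] + 0.5 * (pm[i][j] + mp[i][j]) for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    print("[L_z, L_+] = +L_+ :", is_close(commutator(L_Z, L_PLUS), L_PLUS))
    print("[L_z, L_-] = -L_- :", is_close(commutator(L_Z, L_MINUS), scaled(-1.0, L_MINUS)))
    print("[L_+, L_-] = 2L_z :", is_close(commutator(L_PLUS, L_MINUS), scaled(2.0, L_Z)))
    print("Casimir diagonal  :", [round(casimir()[i][i], 12) for i in range(3)])
```

Here the algebra has dimension three and rank one, so there is a single label $m$ and a single Casimir, equal to $l(l+1) = 2$ on every state of the multiplet.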
Chapter III.1


The structural hierarchy of matter 


Collective behavior and 
the emergence of complexity 


The behavior of large and complex aggregates 
of elementary particles, it turns out, is not to be 
understood in terms of simple extrapolation of 
the properties of a few particles. Instead, at 
each level of complexity entirely new proper- 
ties appear and the understanding of the new 
behaviors requires research, I think, as fundamental
in its nature as any other.


P.W. Anderson in More is different (1972) 


If we start from a large number of simple constituent par- 
ticles which have simple interactions with each other, the 
collective of such particles may well exhibit a rich struc- 
tural diversity and complexity. If we manage to identify the 
relevant collective degrees of freedom in the macroscopic 
system, then another simplicity may be regained, however. 
And relevance is what counts. This approach may reveal 
a hidden order and allow for an effective description of the 
apparent chaos and complexity in a limited number of vari- 
ables. 


Lost individuality. Let us start with a human analogy. 
Think of a couple, if they never talk to each other or seem 


to communicate, you'll treat them as separate individuals. 
You think of their ‘relation’ as a minor perturbation on their 
existence as individuals. However, if they are close and 
their relationship is of a symbiotic kind, you will treat the
pair as a single entity: they are nice or crazy, or stupid. 
Their individuality is neither visible nor relevant it seems, 
what becomes relevant are the properties of the couple 
and these may be totally different from those of the individ- 
ual. 


Constituents and their interactions. The two cases rep- 
resent two different regimes, which you might call weakly 
or strongly coupled. In the strongly coupled regime the 
next question is how the couples interact with each other, 
because that will have decisive implications for the col- 
lective behavior of a large crowd of people. To under- 
stand collective behavior one has to have some insight in 
the different aggregation levels below, in what the relevant 
agents at various levels are and how they interact. Are 
they individuals, couples, families or communities? 


The differences in social organization between bees, ants, 
dolphins and humans can only be partially traced back to 
the difference in their specific species-linked features (for 
example the way their genetic information is passed on
to the next generation) but to a large extent the social hierarchies
they form depend on the nature of their interactions.

[Pie chart "What’s in the air?": nitrogen; oxygen; other (carbon dioxide, water vapour, argon)]


Figure III.1.2: What's up in the air? Air is a mixture of chemi- 
cals, and note that the nitrogen and oxygen components consist 
of the diatomic molecules $N_2$ and $O_2$. These atoms, like people,
prefer to pair up somehow.


External parameters. Yet, there are still other important 
factors that play a role. Given the properties of the relevant 
constituents and their interactions, there may be different 
ways society becomes organized. In general it will also 
depend on external ‘environmental’ factors and dynamics. 
Revolutions may take place where a society reorganizes 
itself rather drastically. Depending on the external param- 
eters it may go through ‘tipping points.’ A society may 
choose to adopt a new constitution, thereby redefining the 
basic set of behavioral rules. As external observer you 
usually don’t directly observe the constitution, rather what 
happens as a consequence of it. What you may see is that 
the collective behavior changes drastically. And you may 
wonder whether they changed the constitution or whether 
the reason was a financial crisis for example. 


What’s (up) in the air? Similar questions arise in physics
if one wants to understand the binding of atoms into
molecules or into macroscopic media like solids, liquids or 
gases. An everyday example is ordinary air: it is predom- 


inantly made up of the simple elements nitrogen and oxy- 
gen, and minor fractions of carbon, hydrogen and argon. 
But, in fact air is a mixture of chemical composites, since 
the nitrogen and oxygen have paired up (but for example 
not tripled up) while the others appear in composites like 
water vapor and carbon dioxide. Argon is the only element 
in the mixture perfectly happy on its own, an ideal Einzelgänger,
precisely because its electrons fill an entire shell of
orbits, and this makes the atom inert, literally like a closed 
quantum shell. 


From physics to chemistry to biology to... Here we en- 
ter the vast domain of chemistry, and condensed forms of 
matter in general, including the modern material sciences, 
biochemistry and molecular biology. These fields of sci- 
ence concern mesoscopic or macroscopic systems, which 
are characterized by a specific hierarchy of aggregation 
levels. The actual structural outcome may change drastically
depending on external factors like density, temperature
and pressure. The system may go through a so-called
phase transition, where it reconstitutes itself in a tumul- 
tuous way before ending up in a new stable lowest energy 
ground state that may be drastically different from the state 
it started out from. We all know that water molecules can 
manifest themselves collectively in many radically different 
guises such as vapor, liquid and ice, but also in alternative 
structures like raindrops, hail and a huge morphological 
variety of snowflakes. 


Emergent behavior. You can compare the ground state 
of a medium with what the constitution is for a human so- 
ciety. You do not observe it directly, only through the emer- 
gent behavior of the collective excitations it supports. The 
constitution is manifest in the way the society functions, or 
dysfunctions for that matter. It is the great variety in ways 
that matter has organized itself, which made it very hard 
to figure out what the constituents were in the first place. 
In this quest for ever more fundamental building blocks,
unrestrained reductionism reigned, as we witnessed in
Chapter I.4. To provide a broader context for the main subject
of this book, we will in the remainder of this chapter highlight
some representative examples of structural hierarchies of
ever increasing complexity. These emergent hierarchies
are in some way or another the collective expression
of the underlying quantum principles.


The ascent of matter 


Cosmic evolution. The hierarchy of structures found in 
nature is quite universal. If we think bottom up, we start 
with the stable constituent particles of the Standard Model 
as depicted in Figure I.4.35, in particular the up and down
quarks, and the electron. From a history of science
perspective, working bottom up is anti-historical in the sense
that the most basic constituents are the ones that have 
been discovered most recently, while many of the chemical 
compounds have been known for thousands of years. 


The reason to nevertheless work bottom up is that we
know this is the way matter has systematically built up
in the early stages of our universe. Starting from the basic
constituents that stepwise aggregate into complex structures
on large scales turns out to be the true historical account
after all. The universe cooled down in the course
of its expansion. This means that thermal collisions between
constituents became less and less violent, so that
ever weaker and more subtle binding mechanisms could
become effective in forming increasingly complex stable
structures. These structures emerged as a result of the
four basic interactions and because the external conditions
like temperature and density kept changing. Let us
go through some of the very early stages guided by the
events marked in Figure III.1.3.


[Figure labels: Planck, Inflation, Baryogenesis]

Figure III.1.3: Cosmic evolution. The figure shows the
subsequent phases of the early universe, exhibiting matter
organizing itself in ever more complex structures.


The Planck and inflationary era. We discussed the very
early stages of the universe in the section on Big Bang
cosmology on page 66 of Chapter I.2. The true origin of our
universe is hidden behind the curtain of quantum gravity,
for which we do not have a satisfactory theory. That curtain
obstructs our understanding of the universe for times
smaller than the Planck time, which is about $10^{-43}$ s. So
what the Big Bang really is we don’t know, but that such
a dramatic event took place some 13.7 ± 0.2 billion years
ago is beyond doubt. This was established unequivocally
from observing the aftermath of it. A first grand event is
the period of cosmic inflation, in which our universe scaled up
exponentially, thereby generating an enormous amount of
vacuum energy and making it homogeneous, isotropic and
flat. The picture is that the latent vacuum energy of the
inflated universe was converted into all the (dark) matter and
radiation that fill the universe today.
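The order of magnitude of the Planck time quoted above can be checked from the standard definition $t_P = \sqrt{\hbar G / c^5}$. This definition is not spelled out in the text and is supplied here as an assumption, as are the rounded values of the constants and the function name. A minimal sketch:

```python
import math

# Constants in SI units (rounded CODATA-style values; assumptions of this sketch).
HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299792458.0         # speed of light, m/s

def planck_time():
    """Return the Planck time sqrt(hbar * G / c^5) in seconds."""
    return math.sqrt(HBAR * G / C**5)

print(f"Planck time: {planck_time():.2e} s")
```

The result is a few times $10^{-44}$ s, which is the scale below which, as the text says, a theory of quantum gravity would be needed.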


Primordial baryogenesis. Shortly after the Big Bang the
universe was presumably filled with the most basic forms
of energy, a primordial soup: matter in the form of quarks,
leptons and their antiparticles, and many types of radiation.
The strong interactions were operative; however, the quarks
and gluons were not in a confining phase, but in the
quark-gluon plasma phase we mentioned on page 195 in
Chapter I.4. A separate important question is the presence
and role that dark matter may have played in the very early
stages of the universe. This role strongly depends on what
dark matter precisely is. What we know for sure is that it
interacts very weakly with ordinary matter, and therefore it
will not greatly affect the processes we will describe next.
Ordinary matter and radiation all interact frequently
enough to stay in equilibrium with each other. There is a
simple rule, following from special relativity, that tells us
that matter and anti-matter will effectively annihilate each
other once the thermal energy drops below twice the rest
energy of the particle type in question: $kT < 2mc^2$.


This assumes in addition that the density is large enough
for particles and antiparticles to run into each other often
enough. Not much matter would be left if a slight asymmetry
of matter over anti-matter had not developed at a very early
stage, so that after the annihilation of all available
anti-matter a tiny surplus of matter (of 1 part in $10^{9}$)
remained, and that is all the ordinary matter present in our
early universe.
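To get a feeling for the annihilation rule $kT < 2mc^2$, one can convert it into a threshold temperature $T = 2mc^2/k$ for a given particle type. The sketch below does this for electrons and protons. The rest energies and the Boltzmann constant are standard rounded values that do not appear in the text, and the function and dictionary names are hypothetical.

```python
# Boltzmann constant in eV per kelvin (rounded CODATA-style value; an assumption here).
K_B_EV_PER_K = 8.617333262e-5

# Rest energies m c^2 in eV (rounded standard values; assumptions of this sketch).
REST_ENERGY_EV = {
    "electron": 0.51099895e6,
    "proton": 938.27208816e6,
}

def threshold_temperature(rest_energy_ev):
    """Temperature (in K) at which kT equals twice the rest energy m c^2."""
    return 2.0 * rest_energy_ev / K_B_EV_PER_K

for name, energy in REST_ENERGY_EV.items():
    print(f"{name}: T ~ {threshold_temperature(energy):.2e} K")
```

Under these assumptions electron-positron pairs stop being replenished at roughly $10^{10}$ K, while for protons and antiprotons the threshold lies about two thousand times higher, so the heavier pairs disappear first as the universe cools.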


Primordial proton and neutron synthesis. When the
universe was roughly $10^{-6}$ seconds old, the up and down
quarks started binding into protons and neutrons due to
the color force mediated by the gluons. The nucleon
synthesis processes are


\[
\begin{aligned}
u + u + d &\rightarrow p \\
d + d + u &\rightarrow n \\
e + p &\leftrightarrow n + \nu
\end{aligned}
\tag{III.1.1}
\]


In this phase the universe was basically filled with a plasma 
consisting of protons, neutrons and electrons, and radia- 
tion consisting of photons and neutrinos. 


Primordial nucleosynthesis. After about 3 minutes the
first nuclear fusion processes started to take place, the
so-called primordial nucleosynthesis, in which the lightest
stable nuclei were produced, like $^{4}$He, $^{3}$He and tiny
amounts of lithium ($^{7}$Li) and beryllium ($^{7}$Be). The
process stopped there, basically because there are no stable
nuclei with mass number 5 or 8 that could serve as the next
stepping stones. The typical sequence of fusion steps is:


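\[
\begin{aligned}
p + n &\rightarrow {}^{2}\mathrm{D} \quad \text{(Deuterium)} \\
{}^{2}\mathrm{D} + p &\rightarrow {}^{3}\mathrm{He} \\
{}^{2}\mathrm{D} + n &\rightarrow {}^{3}\mathrm{T} \quad \text{(Tritium)} \\
{}^{2}\mathrm{D} + {}^{2}\mathrm{D} &\rightarrow {}^{4}\mathrm{He} \\
{}^{3}\mathrm{T} + {}^{4}\mathrm{He} &\rightarrow {}^{7}\mathrm{Li} \\
{}^{3}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{7}\mathrm{Be}
\end{aligned}
\tag{III.1.2}
\]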


Note that the process proceeded via intermediates such as
the hydrogen isotopes deuterium and tritium, mostly ending
up in stable $^{4}$He nuclei. After the first fifteen minutes
the cosmic abundances settled to about 75% hydrogen (H = p)
and 24% helium-4. The prediction of these primordial cosmic
abundances was one of the important successes of using
quantum (nuclear) theory in the context of the early universe.
Many others were to follow.
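A rough consistency check of the helium abundance is possible with simple counting. If essentially all neutrons end up in helium-4, each nucleus takes two neutrons and two protons, so the helium mass fraction is $Y = 2x/(1+x)$ with $x$ the neutron-to-proton ratio. The value $x \approx 1/7$ used below is a commonly quoted figure and is an assumption here, not a number from the text; the function name is hypothetical.

```python
def helium_mass_fraction(n_over_p):
    """Helium-4 mass fraction if all neutrons are bound into He-4.

    With n neutrons and p protons, n/2 helium nuclei of mass ~4 form,
    so Y = 2n / (n + p) = 2x / (1 + x) with x = n/p.
    """
    return 2.0 * n_over_p / (1.0 + n_over_p)

# Assumed neutron-to-proton ratio at the onset of nucleosynthesis.
y_helium = helium_mass_fraction(1.0 / 7.0)
print(f"helium: {100 * y_helium:.0f}%, hydrogen: {100 * (1 - y_helium):.0f}%")
```

With this assumed ratio the estimate gives about a quarter of the mass in helium and three quarters in hydrogen, close to the abundances quoted above.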


Gravity’s opportunity: the seeds of large-scale structure.
Only after about 300,000 years would the simplest atoms
form, meaning that the electrons would combine
with the aforementioned nuclei to form electrically neutral
atoms. At that point the universe was filled with a gas
of neutral atoms. The photons decoupled, and the
gravitational force became dominant. Inhomogeneities
corresponding to local maxima in the mass density of particles
attracted other particles more strongly than the low-density
regions did, and therefore high-density regions started to
build up mass. From a gravitational point of view all masses
attract each other, and the more mass the stronger the
attractive force. This means that pockets where the energy
density is higher than average will grow. These early density
inhomogeneities are the seeds of the large-scale structure
in the universe.
Chemical   Milky    Solar    Earth    Human
element    Way      system   crust
H          73.90    70.57     0.14    10
He         24.00    27.52     -       -
O           1.04     0.59    46.00    65
C           0.46     0.30     0.03    18
Ne          0.13     0.15     -       -
Fe          0.11     0.12     5.0     $6\cdot 10^{-3}$


Figure III.1.4: Carbon production. It is shown how carbon
nuclei were produced in the universe by successive fusion
processes of $^{4}$He inside stars.


From stardust we are made. In the center of these ever
denser clouds, pressure and temperature started building
up locally, again reaching high temperatures of millions of
degrees. This gave rise to a next round of nuclear fusion
processes. That is how the diverse array of chemical
elements in nature was slowly created in the cores of many
generations of stars, and the stockpile of basic chemical
elements, indispensable for the later chemistry of life, was
built. The truth is that all of us are made of stardust! It
is interesting to be aware of the fact that this process took
billions of years, because several generations of stars were
needed to build the heavy nuclei. And the fact that our
expanding universe has to be old explains why it is also big
and cold. It has to be, otherwise we could not be here to
observe it. What feels like an utterly inhospitable
environment turns out to be necessary for life to be possible
in the first place.


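Table III.1.1: Mass abundances. Abundances (in %) of some
common chemical elements at different extraterrestrial and
terrestrial levels.


We see from the periodic table that in principle by adding
on helium nuclei, elements like beryllium and the all-important
carbon and oxygen can be reached, as indicated in
Figure III.1.4. For example:


\[
\begin{aligned}
{}^{4}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{8}\mathrm{Be} + \gamma \\
{}^{4}\mathrm{He} + {}^{4}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{12}\mathrm{Carbon} + \gamma \\
{}^{12}\mathrm{Carbon} + {}^{4}\mathrm{He} &\rightarrow {}^{16}\mathrm{Oxygen} + \gamma
\end{aligned}
\tag{III.1.3}
\]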

The way carbon is synthesized is remarkable to say the 
least. The effectiveness of the processes above is due to 
a subtle resonance which amplifies the second process. It 
remains mysterious that on the one hand all of life is car- 
bon based, whereas the actual production of the carbon 
itself was a process depending on a delicate balance of 
values of the constants of nature. From this point of view 
one is tempted to conclude that life is a miraculous coinci- 
dence! 


In Table III.1.1 you see what happened to the original
galactic abundances, like in our Milky Way, on their way to
become tiny parts of our physical bodies. The explanation of
how these changes came about goes beyond the scope of
this book.
Molecular binding


Atoms are electrically neutral because the positive charge 
of the nucleus is exactly cancelled by the negative charge 
of the electrons. Yet the charges are not exactly on top of 
each other so what you find if you go to short distances 
is that there are residual electromagnetic interactions (like 
dipolar forces) that become dominant. These residual in- 
teractions are to a large extent responsible for the fact that 
atoms bind in such a rich diversity of structures, be it mol- 
ecules of varying complexity, or solids, or other types of 
condensed states of matter. 


Repulsion versus attraction. Interactions are the mother 
of binding and binding is the father of structure. The se- 
cret of building spatially extended structures resides in the 
fact that the binding between atoms or molecules is the 
outcome of a delicate balance between a repulsive force 
that dominates at small distances and an attractive force 
that dominates at large distances. The typical behavior
for the energy U of a pair of atoms as a function of their
separation r is given in Figure III.1.5. Understanding the
curve is not hard. Imagine releasing a marble on the
energy curve: starting at a small r it would roll away to
large distances (that is the repulsive part of the interaction),
but starting at large r it would roll towards the origin
(the attractive part). So, if the marble were to experience
some friction, then irrespective of where you start it
would always end at a separation $r = r_0$, where the
potential energy is minimal. This picture reminds us of the
atomic binding of Figure I.4.5, at least in a qualitative sense.
We conclude that also in this domain stability is based on
a compromise between attraction and repulsion. This is a
feature underlying the formation of structure on most levels
of complexity.


Figure III.1.5: The interatomic interaction potential. The
interaction potential of two hydrogen atoms as a function of
their distance. For short distances the force is repulsive but
for long distances attractive. This behavior is a consequence
of the sharing of electrons, which implies that a negative
charge cloud forms between the two positively charged nuclei.
The minimal energy configuration is achieved for a distance
$r_0$. So free hydrogen spontaneously forms a gas of diatomic
molecules H2.


Van der Waals binding. The basic attractive interatomic
force is the Van der Waals force, named after the Dutch 1910
Nobel laureate Johannes Diderik van der Waals. It even works
between two atoms that are called ‘inert’, like argon or neon.
They have completely filled shells, which means the charge
cloud is spherical. However, if they get close these clouds
become deformed and each atom develops an (induced)
dipole moment, which just means that the resulting plus
and minus charges have different spatial distributions. The
induced dipole moments lead to a weak attractive force
between the atoms. It is weak because the interaction
potential drops off as $\sim 1/r^{6}$, which is much faster
than the Coulomb potential ($\sim 1/r$) between two opposite
charges. On the other hand, if the atoms are attracted they
cannot come too close, because then the electron clouds start
overlapping and that causes a strong repulsion and a steep
rise of the potential at short distances ($\sim 1/r^{12}$).
That repulsion is due to the Pauli principle, which holds for
the electrons: it provides a hard core for the interactions. This
potential is depicted in Figure III.1.5. At low temperatures
the Van der Waals interactions may lead to the formation
of a solid where all the atoms form a regular array, and the
nuclei occupy the sites of a crystal lattice.
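The competition between a $1/r^{6}$ attraction and a $1/r^{12}$ repulsion is often modeled by the so-called Lennard-Jones potential, $U(r) = 4\epsilon\,[(\sigma/r)^{12} - (\sigma/r)^{6}]$. The text does not name this form, so treat it as a commonly used illustration rather than the author's own formula; the parameters `epsilon` (well depth) and `sigma` (length scale) and the function names are assumptions of this sketch.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Model pair potential: steep 1/r^12 repulsion plus weak 1/r^6 attraction."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def equilibrium_separation(sigma=1.0):
    """Separation r0 where the potential is minimal: r0 = 2^(1/6) * sigma."""
    return 2.0 ** (1.0 / 6.0) * sigma

r0 = equilibrium_separation()
for r in (0.95, r0, 1.5, 2.5):
    print(f"r = {r:.3f}  U = {lennard_jones(r):+.4f}")
```

In these units the minimum has depth $-\epsilon$ at $r_0 = 2^{1/6}\sigma$: at smaller separations the energy shoots up steeply (the hard core), while at larger separations it approaches zero from below, which is the marble picture of a stable separation $r_0$.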


Polar (or ion) binding. Atoms have a certain number
of electrons which form a charge cloud around the nucleus.
The electrons successively have to occupy different
states, which is why the charge clouds differ from atom to
atom. Now for the chemistry of atoms, for example which
molecules they can form, the shape of the clouds is
all-important. The number of valence electrons is the number
of electrons in the highest unfilled shell. The tendency of
atoms is that they like to fill their outer shell. They can
do that basically in two ways. One is that they can pick up
the electrons of another atom, in which case the atom that
gives away electrons becomes a positive ion and the one
that takes extra electrons becomes a negative ion. The
ions have the same old nucleus but have a net charge
because of an electron surplus or deficit. Clearly the ions
made through this ‘social’ mechanism of giving and taking
have opposite charges and will be attracted to each other
because of the Coulomb force between them. But again,
at small distances the repulsive interaction of the clouds
takes over, and the qualitative features of the picture of
Figure III.1.5 remain valid.


A lot can be said based on the location of the atoms in the
periodic table, in particular the column they are in. Take
the elements in the first column, like hydrogen for example:
they have one electron in the outer shell. As it happens
these atoms are actually quite social: they are willing
to give away their electron and to turn into a positively
charged ion. Complementary behavior is observed in certain
elements in the one-but-last column, like chlorine (Cl),
that like to receive an extra electron to fill their outer shell
and turn into a negative ion. So indeed we see polar binding
between atoms in the first column and the one-but-last
column. And we see many well-known elementary molecules
like HCl (hydrochloric acid) and NaCl (kitchen salt)
that are held together this way.


Covalent binding. Simple atoms like hydrogen, oxygen or
nitrogen, the latter two being the main components of
ordinary air, are bound in pairs. The question is how the
pair-binding in the diatomic gases precisely comes about.
How can it work, given that there are no ions to be formed?
In these cases a different mechanism is operative that is also
quite ‘social’, as it is based on the notion of sharing. Once
close enough, atoms can lower their energy by sharing outer
electrons; they spread, as it were, their negative charge
clouds over the two nuclei. The cloud
is mostly concentrated between the nuclei, and that means
that these become attracted to the cloud and therefore to
each other. The binding that results from this mechanism
is called covalent binding.


We have mentioned that what matters are the shapes of
the charge clouds corresponding to the outer (or valence)
electron orbitals. They tell us a lot about the geometrical
patterns of molecules and materials. On the other hand 
once we realize that the atoms are composites of nuclei 
and electrons and therefore by themselves complex ob- 
jects, we should not be surprised to learn that in the behav- 
ioral diversity they exhibit, much will depend on the details 
of the atoms in question. 


Hydrogen bonds. Once you know how atoms form molecules
there is the next step up, which is to understand
how molecules bind with each other or, in case they become
large, how they interact with themselves to produce
more and more elaborate molecular structures. Here
more intricate mechanisms are at work. A
well-known example of this is the so-called hydrogen bond
that plays a vital role in organic chemistry and therefore
also in biochemistry. It is based on the idea that molecules,
or parts of molecules, may also behave like electric dipoles
and therefore attract each other. The term hydrogen
here refers to the fact that hydrogen, when it binds
to a strongly electronegative atom such as oxygen or
nitrogen, like in water, gives a polar molecule that binds
through these hydrogen bonds. This type of binding is what
keeps the water molecules together in the liquid, and it for
example explains the relatively high boiling temperature of
water. The hydrogen bond is thus structurally similar to the
Van der Waals force, but it is stronger. These bonds play a
vital role in understanding the spatial geometry of complex
biomolecules.


Figure III.1.6: Molecular shapes. We have depicted the
spatial geometry of the atoms forming a molecule, and the
charge clouds corresponding to the shared and lone electron
pairs.


It’s all quantum plus electrodynamics. All this being
said, I would like to stress that all chemical binding
mechanisms are a product of two fundamental ingredients.
One is the set of underlying quantum principles as expressed
by the Schrödinger equation, and the other is formed by the
laws of electromagnetism governing the forces between
charges. It means that if we were to put the constituents
and their basic electromagnetic interactions in the
Schrödinger equation and let a powerful computer turn the
crank (as is often done in practice), we would generate
the structures we observe. Such calculations show that
the theory is correct and have great value for applications.


They do however not replace or satisfy our need to under- 
stand the basic physical and chemical mechanisms. Sci- 
entists have introduced many so-called forces and effec- 
tive interactions and bonds, exactly because they provide 
a kind of elementary toolkit to effectively explain and pre- 
dict chemical behavior. But we should remember that all 
of those new forces are nothing but residual electromag- 
netic interactions between objects like atoms or molecules 
or chemical ‘groups’ that have intricate charge distributions 
determined by the laws of quantum theory. It’s all a matter 
of shapes and these shapes can be described as ‘multi- 
polar fields’ of which the dipole is the simplest example. 
The quantum laws are strong, accurate and universal, and 
even though they don’t allow us to understand all of chem- 
istry directly from first principles, they do allow us to com- 
prehend in detail the basic mechanisms that in a subtle 
balance give rise to the elaborate chemical structures we 
observe in nature. 


The miraculous manifestations of carbon 


The plug and play of organic chemistry. In this subsec- 
tion we take a closer look at the element carbon and the 
remarkable structures it can form all by itself, as displayed 
in Figures III.1.7 and III.1.8. We start simple and add more
complexity along the way. 


The spatial geometry of simple molecules. Because
the carbon atom sits in the fourth column of the periodic
table, it has four valence electrons to share. Hydrogen has
one to share, so carbon can bind to four hydrogen atoms
to form a methane (CH4) molecule, which as you probably
know is a strong greenhouse gas. Both atoms
are happy because they made a perfect match in the one
to four ratio. What about the other bad guy, carbon dioxide
(CO2)? Well, now the carbon shares two electron pairs
with each of the oxygens to optimize its sharing strategy.
And what about H2O, just innocent water? Well, the oxygen
clearly shares one pair with each of the hydrogens,
and there are four non-bonding electrons left on the
oxygen.


The next question that naturally arises is what do these 
molecules look like? Can we from the binding mecha- 
nism decide what the spatial configuration will be? For 
simple molecules this is indeed the case as is shown in 
Figure III.1.6. The resulting shape follows from the mutual
repulsion of the negatively charged electron clouds, which 
try to avoid each other as much as possible. 


Shapes of simple molecules. So, for methane or
CH4 it should not come as a surprise that it forms a perfect
tetrahedron with the carbon nucleus at the center and
the hydrogen nuclei at the four corners. The clouds on the
bonds indeed maximally avoid each other, meaning that the
bonds make angles of about 109.5 degrees. For CO2 there are
two double bonds and we expect a linear structure with
the carbon nucleus in the middle, right in between the two
oxygens. A detail is that a double bond defines a
plane. The two double bonds mutually repel and therefore
the plane connecting to the first oxygen will be perpendicular
to that connecting to the second. And what about the
water molecule H2O, is it also linear? Here there is another
ingredient: the four leftover electrons of the oxygen
form two lone pairs, whose clouds are also attached to the
oxygen. So in fact there are four clouds that point roughly
to the corners of a tetrahedron, and as the clouds
are not identical the H2O molecule has a bent structure.
The lone pairs tend to be bulkier and therefore push the
peripheral atoms together, so that the angle between them
will be smaller than in the symmetric case. That explains
why the two bonds to hydrogen make an angle, not
of 109.5 degrees but of about 104.5 degrees.
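The bond angle of a perfectly symmetric tetrahedral molecule such as methane follows from elementary vector geometry. The sketch below places the central atom at the origin and the four neighbors on alternating corners of a cube, which is one convenient (assumed) choice of coordinates; the names are hypothetical.

```python
import math

# Four corners of a regular tetrahedron centered on the origin (the central nucleus).
CORNERS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

def angle_between(a, b):
    """Angle in degrees between two bond directions a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return math.degrees(math.acos(dot / (norm_a * norm_b)))

tetrahedral_angle = angle_between(CORNERS[0], CORNERS[1])
print(f"tetrahedral bond angle: {tetrahedral_angle:.2f} degrees")
```

Every pair of corners gives the same value, $\arccos(-1/3) \approx 109.5$ degrees. This is the angle of the fully symmetric case; the bulkier lone-pair clouds of water squeeze the angle between the two hydrogen bonds to a somewhat smaller value.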


Greenhouse gases. Carbon dioxide is made by burning 
carbon containing materials. It is an enormously useful 
chemical compound but the problem is that we have pro- 
duced and still produce far too much of it. It plays a haz- 
ardous role in our atmosphere as it is a greenhouse gas. 


This is the case because molecules that have a certain
structural complexity (like carbon dioxide, methane,
but also water vapor) have many low-energy, oscillatory
quantum mechanical modes in which they can absorb and
(re)emit radiation, in particular modes corresponding to
heat radiation. So the heat that is radiated from the
Earth’s surface after being absorbed from the sun, or heat
produced by human activities, gets absorbed by the CO2
blanket in the atmosphere, and is then reemitted. But the
reemission is isotropic, meaning the same in all directions,
and therefore about half of the reemitted heat goes back to
the Earth, and that is why the Earth heats up.


Photosynthesis. One way to get rid of CO2 is through
vegetation: plants absorb carbon dioxide from the air, and
in a process called photosynthesis combine it with water
and light (photons) from the sun to produce carbohydrates
and the oxygen we need. The process can be summarized
as CO2 + H2O → [CH2O] + O2. Water vapor in the air
certainly does contribute to the greenhouse effect, in
that it considerably increases the warming caused by
carbon dioxide. However, water is engaged in all kinds
of other climatological cycles, like cloud formation and
rain, that make its role essentially different: the vapor
concentration in the atmosphere changes by large amounts
on a short time scale of days or weeks.


Hydrocarbons. Once you realize that carbon has four
binding sites available, you realize that there are extremely
diverse ways to combine carbon atoms into molecules. Carbon
is an ideal example of a basic building block. And nature
learned to play with it. Imagine you start with a tetrahedral
methane CH4 molecule, and you replace one hydrogen by
another carbon (with its own hydrogens attached); then that
is also a compatible configuration. Continuing this process
two more steps you get the butane molecule of
Figure III.1.7(a). It is evident that hydrocarbons like
$\mathrm{C}_k\mathrm{H}_{2k+2}$ can in principle form for any
value of k. These molecules correspond to long linear
chains.
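The counting behind the chain formula is simple: every carbon along the chain carries two hydrogens, and each of the two ends carries one more, giving $2k + 2$ hydrogens for $k$ carbons. A small sketch that writes out the first few members of the family (the function name is hypothetical):

```python
def alkane_formula(k):
    """Return the molecular formula C_k H_(2k+2) of the linear chain with k carbons."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    hydrogens = 2 * k + 2  # two per carbon along the chain, plus one at each end
    carbon_part = "C" if k == 1 else f"C{k}"
    return f"{carbon_part}H{hydrogens}"

# Methane, ethane, propane, butane.
for k in range(1, 5):
    print(k, alkane_formula(k))
```

For k = 1 this reproduces methane, CH4, and three substitution steps later, at k = 4, it gives butane, C4H10.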
(a) Carbon has the powerful property that it can form long
linear chains with hydrogen atoms on the side. This is the
highly flammable gas butane for example.

(b) A polymer is a linear chain made up of identical units.

(c) The common sugars or carbohydrates glucose (l) and
fructose (r). These have a chirality or handedness; there are
two forms. The case where the bottom group is on the left or
the right is like a left or right shoe. They form mirror images
that cannot be rotated into each other.

(d) If you can make chains you also can make cycles without
extra ingredients. This is the benzene molecule C6H6, featuring
the famous hexagonal ring structure with three double and
three single bonds.


Figure III.1.7: Miraculous carbon. Carbon plays a central role
throughout organic chemistry. With its four bonds it is
remarkably versatile and can make linear, planar or
3-dimensional structures.
Polymers. One can go one step further, and build long
linear molecules that are repetitive. Such long chains
of identical or similar units are called polymers, as shown
in Figure III.1.7(b), and it is a world of its own to design
polymers in such a way that they exhibit dedicated chemical
properties, with particular applications in mind. This
is what a substantial part of the bulk chemical industry is
about.


Ring structures. There is not only the possibility of open
carbon chains; you can also imagine the formation of cycles
or closed chains, like the so-called benzene ring C6,
which nature discovered and used over and over again.
Ring structures like cyclopentane, cyclohexane and their 
polygon shaped relatives play an important role in the bio- 
chemistry of the base pairs in DNA and also in the amino 
acids from which the proteins are built. Furthermore, they 
are ‘bread and butter’ for the chemical and food indus- 
tries. 


Nano physics 


Nano science. Carbon composites don’t stop in the
one-dimensional world of chains and cycles. Nothing keeps
carbon from engaging in trivalent bindings, meaning that a C
atom has not just two C neighbors, but three that form an
equilateral triangle. Such a connection opens the possibility
of making two-dimensional structures with the topology
of planes, tubes and balls, and two-dimensional surfaces
that have holes in them, the simplest one being the torus
or donut.


Mesoscopics. With the carbon structures we just mentioned
we enter the unfolding world of nano-science and
technology, where one is dealing with molecular structures
on a nano scale, so typically involving up to a few hundred
atoms. This domain is also called mesoscopic, just in
between the macroscopic and microscopic worlds.


Nature’s LEGO. Every parent remembers the thrill of what 
happens after you hand a group of playful children a big 
box of the most basic LEGO pieces. It is amazing what 
kind of stable and metastable structures they come up 
with. In this sense evolution is like a room full of chil- 
dren with an overdose of LEGO pieces, and once you re- 
alize that, those elaborate carbon structures become little 
more than the inevitable outcome of a childlike but power- 
ful methodology called trial and error. 


Buckyballs. A most remarkable discovery was the buckyball
or C60, a gigantic molecule that is spherical rather than
linear and made up of alternating pentagons and hexagons
(see Figure III.1.8(a)). It was predicted by theoretical
calculations to be extremely stable. Such large carbon
molecules (not only C60 but actually a whole range going from
C40 to maybe C49) are now called fullerenes. This name
refers to Buckminster Fuller, the American architect who
pioneered the design and construction of geodesic domes.


Nano tubes. Closely related are the nano-tubes depicted
in Figure III.1.8(b), which have attracted a massive amount
of attention because of their many potential applications.
These tubes are thin: the smallest have a diameter of only 
a few nanometers. This makes them extremely strong in 
proportion to their weight. Large nano-tubes are hard to 
make and this has so far hampered their large-scale appli- 
cation in technology. Let us finally mention the materials 
that are only made from carbon atoms. 


Diamond and graphite. If each C atom has four C neighbors,
naturally located at the corners of a tetrahedron, this
allows for the formation of wonderful three-dimensional
lattices. One of those is quite exquisite indeed, because
it is the diamond lattice. Diamond is pure carbon in a
splendid guise, as it is extremely hard, highly transparent
and very expensive. Diamond has a relatively high density
(3.5 g/cm$^3$), does not conduct electricity (although it
conducts heat very well) and is insoluble in any solvent.
(a) The football shaped C60 molecule is an example of a
fullerene, named after Richard Buckminster Fuller, the
architect and pioneer in designing and building of geodesic
domes.

(b) A carbon nanotube.

(c) The structure of the covalent diamond lattice made with
carbon atoms on all sites.

(d) Amazing graphene: only one molecule thick, and yet the
strongest planar material. It is also transparent and an
excellent conductor.


Figure III.1.8: Carbon structures. Some of the miraculous
manifestations of carbon that all manifestly exploit the
hexagon as basic building block.
Are there other three-dimensional carbon structures
possible? Yes, there is one, much more common than diamond,
and that is graphite, the stuff that sits in your pencil
and makes drawing so easy because it is totally opaque
(black), soft and cheap as well. These properties follow
from the fact that graphite forms easily; it corresponds to a
stack of two-dimensional honeycomb planes that are relatively
weakly bound. Graphite is soft and greasy, it is relatively
light (2.5 g/cm$^3$), a good conductor of heat and
electricity and is soluble in most solvents. How different
can members of one family be!


Graphene. Let us finally mention the recently discovered
miraculous material called graphene; this is a perfect
two-dimensional hexagonal honeycomb sheet which turns out
to be extremely strong in spite of being only a single
atomic layer (see Figure III.1.8(d)). It is furthermore
transparent and has high thermal and electric conductivity.
This highly unusual combination of qualities singles this
material out for many exceptional applications in the
future, varying from wearable electronics and displays to
fancy wrapping materials. It may strike you that the
structure is just like a single layer of graphite. The
story goes that the Russian-born physicist Andre Geim and
his student Konstantin Novoselov, who received the Nobel
Prize in 2010 for their groundbreaking work on graphene,
made the first specimen just by drawing with a pencil on
the sticky side of sellotape.


The molecules of life 


The pinnacles of molecular structure are the molecules of
life such as nucleic acids and proteins. It seems somewhat
far-fetched to present these in an elementary book on
quantum theory. The reason I do is that the structural
hierarchy, as far as single molecules are concerned, really
ends right there. And these structures are basically
dictated by quantum theory. Therefore including them gives
our review of the molecular hierarchy a sense of
completeness.


Figure III.1.9: The chemical composition of DNA. A fragment
of the double-stranded DNA molecule. The picture also gives
the molecular structure of the base molecules with the
four-letter code assigned to them. The four letters A, T,
G, C are strictly paired as A-T and G-C. The pairs are
relatively weakly bound by hydrogen bonds indicated by the
dotted lines. The DNA of the human genome contains about
3 billion base pairs, which contain among other things the
genes that encode about 20,000 proteins. (Labels in the
figure: thymine, adenine, 5’ end, phosphate-deoxyribose
backbone.) (Source: Wikipedia)


Let us therefore briefly summarize some structural aspects
and not talk about the functional part. As a matter of fact
the real tasks in the living cell are mostly performed by
complex networks of proteins, and that is a level of
emergence that transcends the one fully fixed by the basic
laws of physics.


The complexity of biomolecules is relative in the sense that
again it is a structural level in which a limited number of
particular building blocks are used over and over again.
Nature is brilliant in figuring out ingenious ways to apply
a given structural element in many different ways. The
structure of biomolecules is modular, and the huge diversity
lies not so much in the variety of constituents as in the
way they are put together on a modular level.


The DNA molecule. A well-known example is the DNA
molecule, which is made of tens of billions of atoms. But
its structure is highly repetitive, so that one only has to
show a little piece to see and understand what the building
principles are. And once the architecture of the molecule
is understood it is not so hard to explain the way it
functions either. The structure of the molecule was
discovered in 1953 by Francis Crick and James D. Watson at
Cambridge University, and Rosalind Franklin at King’s
College London. The Nobel Prize in Physiology or Medicine
was awarded in 1962 to the first two and Maurice Wilkins, a
collaborator of Rosalind Franklin in London.


We have illustrated a small segment of the molecule in
Figure III.1.9, and it is clear that the molecule features
two long strands that are kept together with hydrogen bonds
to make a sort of ladder. The stiles of the ladder are just
a sugar-phosphate backbone that repeats itself some three
billion times. The rungs of the ladder are made of pairs of
nucleobases, of which there are only four, called adenine
(A), thymine (T), guanine (G) and cytosine (C). It is the
order in which these four types of rungs appear in the
ladder which encodes the heritable traits of living
organisms. There is a strict pairing, namely A always comes
with T and G always with C, so if you know the left half of
DNA it is easy to construct the complementary right half of
the molecule. And it is this deterministic feature that
allows us to understand how the heritable information can
be reproduced in cell division: the DNA molecule splits,
the left and right halves move to the two different
daughter cells, and each is then completed by synthesizing
the complementary half within the daughter cell. The
chemistry is in fact rather simple but extremely effective.


Figure III.1.10: Amino acids. The generic structure of an
amino acid, with its amino and carboxyl groups. In the
center is a specific group that characterizes the
particular amino acid. Proteins are basically linear chains
of amino acids.


If you think of the genetic information stored in DNA as a
piece of text written in a four-letter alphabet, some
3 billion letters long, then that would maximally amount to
N = 4^(3 000 000 000) possibilities, which corresponds to
six billion (= log₂ N) bits of information. That amount of
data would easily fit on a DVD or USB stick, in fact a good
deal less because most of the information is highly
repetitive and not conserved at all and therefore believed
not to be that important. Yet as we are talking about
important hereditary data, we should realize that the same
DVD is sitting in every nucleus of every cell of our body:
you should imagine that you are carrying around trillions
of backups of your genome. I must admit that it makes me
feel kind of important. The DVD of my personal ‘feel good’
movie is not for sale but nevertheless made in huge
quantities. This is how the discovery of a deep secret of
life ended up being little more than a paean to painstaking
reductionism.
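The two pieces of arithmetic in this passage, the strict base pairing and the information count, can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the sample sequence are made up for the occasion, and the complement is built letter by letter without regard to the direction in which a real strand is read.

```python
import math

# Pairing rules quoted in the text: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the strand that pairs, base by base, with `strand`."""
    return "".join(PAIRING[base] for base in strand)


def information_bits(n_letters: int, alphabet_size: int = 4) -> float:
    """Maximal information content log2(alphabet_size ** n_letters) in bits."""
    # log2(4**n) = n * log2(4); this avoids ever forming the huge number 4**n.
    return n_letters * math.log2(alphabet_size)


sample = "GATTACA"  # arbitrary illustrative sequence
print(sample, "pairs with", complementary_strand(sample))

bits = information_bits(3_000_000_000)
print(f"{bits:.1e} bits, or about {bits / 8 / 1e6:.0f} megabytes")
```

With two bits per letter the genome text comes to six billion bits, roughly 750 megabytes, which is indeed less than the capacity of a single-layer DVD.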


Translation of DNA information to protein structure.
DNA is crucial for the organism but it doesn’t do very much:
from a chemical point of view it is not very active. It
functions as a template from which the data corresponding
to a gene are transcribed by RNA molecules, which also
carry it outside the nucleus of the cell. There the
instructions are carried out by ribosomes (large molecular
machines), which read the four-letter code sequence of the
gene as a sequence of three-letter codons. A codon encodes
a specific amino acid, and the codons therefore form the
genetic code. The ribosomes produce from that sequence of
codons a linear chain of amino acids corresponding to a
specific protein. This process is schematically represented
in Figure III.1.11. The number of different amino acids
that can be encoded by a three-letter codon (word) with the
four-letter alphabet can never be larger than 4³ = 64. In
fact there are only twenty-one of them, but most of them
are represented by several different codons. This
redundancy makes protein synthesis more tolerant of copying
errors.
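The counting argument can be made concrete by simply listing all three-letter words. The sketch below is an illustration only; it uses the RNA alphabet, in which the letter U (uracil) takes the place of T, and it takes the number of amino acids from the text rather than from an actual codon table.

```python
from itertools import product

RNA_ALPHABET = "ACGU"  # messenger RNA uses U (uracil) where DNA has T
N_AMINO_ACIDS = 21     # the count quoted in the text


def all_codons(alphabet: str = RNA_ALPHABET, length: int = 3) -> list[str]:
    """Enumerate every word of `length` letters over `alphabet`."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


codons = all_codons()
print(len(codons), "codons, for example", codons[:4])
print(f"on average {len(codons) / N_AMINO_ACIDS:.1f} codons per amino acid")
```

With 64 words available for 21 amino acids, each amino acid can be spelled in about three different ways on average, which is the redundancy referred to above.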


To make the structural hierarchy explicit and complete, I
have displayed the generic structure of the amino acids in
Figure III.1.10. Because of their modular structure they
are in fact quite similar, consisting of an amino and a
carboxyl group and a specific variable group in the center.
This group may contain five- and six-membered rings and
combinations thereof, somewhat similar to what we saw in
the DNA segment of Figure III.1.9. A protein is just a
linear sequence of amino acids whose length may run from
tens to hundreds for small proteins to tens of thousands
for the biggest ones. And because of their characteristic
charge distributions these proteins start to fold up in all
kinds of interesting ways, as schematically indicated in
Figure III.1.12. This is called the secondary structure,
where one distinguishes so-called α helices and β sheets
and simpler strings in between such as turns or coils. The
helices are curled up and the sheets are more planar, with
strands bound by hydrogen bonds. The helices and sheets
making up the protein are then again folded in
characteristic ways into complicated and beautiful
three-dimensional geometrical structures (see the rather
random selection in Figure III.1.13).


And again it turns out that their shapes determine to a 
large extent what biological functions the protein can per- 
form. 


Curling up. We should be aware of the fact that the
gargantuan DNA molecule, which has a typical length of, say,
3 billion times a fraction of a nanometer (1 nm = 10⁻⁹ m),
or about a meter, apparently fits in a cell nucleus with a
typical size of 10 micrometers (= 10⁻⁵ m). This fact
implies that nature must have developed some very clever
folding tricks to make this possible. This is a generic
feature of the big molecules of life: they are folded up in
smart and elegant ways, and the way they are folded usually
tells us a lot about the biological function they may
perform. DNA for example is curled up on different levels,
first in small curls, then the curled-up molecule curls up
once more and then again, etc., similar to what certain
phone cords do when you don’t want them to. But to read the
code corresponding to a gene, the corresponding part of the
DNA molecule has to be made accessible, i.e. certain genes
have to be ‘turned on’, depending on what is needed in that
particular cell at that time and place.
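To get a feeling for the mismatch in scales, one can do the multiplication explicitly. The sketch below assumes a spacing of about 0.34 nanometer between successive base pairs, a standard textbook value for the double helix that is not stated in the text; the other two numbers are the ones quoted above.

```python
BASE_PAIRS = 3_000_000_000       # base pairs in the human genome (from the text)
RISE_PER_BASE_PAIR_M = 0.34e-9   # assumed spacing between base pairs, in meters
NUCLEUS_SIZE_M = 10e-6           # typical nucleus size quoted in the text


def stretched_length_m(base_pairs: int, rise_m: float) -> float:
    """Length of the fully stretched double helix in meters."""
    return base_pairs * rise_m


length = stretched_length_m(BASE_PAIRS, RISE_PER_BASE_PAIR_M)
print(f"stretched length: {length:.2f} m")
print(f"length / nucleus size: {length / NUCLEUS_SIZE_M:.0e}")
```

Under this assumption the stretched molecule is about a meter long, some hundred thousand times the size of the nucleus it has to fit into.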


Epigenetics. At this point we enter the domain of epi- 
genetics where one tries to understand how the gene ex- 
pression in the organism is exactly regulated by means of 
other chemical mechanisms using histones and methyla- 
tion. There are indications that also the methylation of the 
DNA is conserved, which means that it is somehow en- 
coded in the DNA. It has been suggested to add a fifth 
letter to mark its positions along the molecule. Unsurpris- 
ingly, several meta-levels of regulation are operative to get 
from the genotype of the organism to the phenotype, to 
get from our DNA to who we are as an integrated being. 
Whether the development of an organism is primarily na- 
ture or nurture, chemistry is the language in which the ex- 
planation will ultimately be cast. 


Figure III.1.11: From DNA to proteins. A schematic of how
the linear four-letter code of a DNA strand gets translated
into a linear sequence of amino acids that form a protein.
The four-letter code is copied onto a single-stranded RNA.
After splicing, which means cutting and copying the various
pieces of the gene to a single sequence on a messenger RNA
molecule, the messenger goes outside the nucleus of the
cell. There the letter sequence is translated by ribosomes
and the protein is synthesized. Each subsequent
three-letter sequence (called a codon) from the RNA gets
translated into one of twenty-one amino acids, see
Figure III.1.10.


(a) Primary structure as a linear chain of amino acids.

(b) Secondary structure with alpha helices and beta sheets.

(c) Tertiary structure. The spatial structure consisting of
folded helices and planes.

(d) Quaternary structure, representing a protein complex,
in this case haemoglobin.


Figure III.1.12: Protein structure. The four levels of
protein structure.


Figure III.1.13: Proteins: the work horses of life. Their
tertiary three-dimensional structural complexity, diversity
and beauty is where the quantum ladder reaches into the
heart of life. One could easily imagine trendy fashion
designers and hair stylists getting inspiration from these
magnificent, all natural, dreadlock designs. For others it
is just a splendid paean to reductionism.


Conclusion. In this chapter we have shown how the complex
hierarchy of matter came into being during the early stages
of cosmic evolution. We have described the wonderful
diversity that the flexibility of the carbon atom allows
for and that is not only evident in the field of
nano-science, but also in biochemistry and molecular
biology. We have given examples of how nature has exploited
the almost unlimited possibilities to create tremendous
diversity from a very limited set of fundamental building
blocks.


Further reading. 
On molecular physics: 


— Molecular Quantum Mechanics 
Peter W. Atkins and Ronald S. Friedman 
Oxford University Press (2010) 


— Molecular Physics: Theoretical Principles and Ex- 
perimental Methods 


Wolfgang Demtröder
Wiley (2005) 


— The Molecules of Life 
John Kuriyan 
Garland Publishers (2012) 


Complementary reading: 


— The First Three Minutes: A Modern View of the 
Origin of the Universe 
Steven Weinberg 
Basic Books (1977) 


— What is Life? 
Erwin Schrödinger 
Cambridge University Press (1992) 


— The Double Helix 
James D. Watson 
Signet Books (1969) 


Chapter III.2


The splendid diversity of condensed matter 


Water waves are called an emergent phenomenon,
because they are a property of the medium water 
but not of the individual water molecules. Emer- 
gent properties, which are ubiquitous in any form 
of collective, result from the combination of con- 
stituent properties and the nature of their interac- 
tions. 


Condensed states of matter 


Condensed matter physics is a research field with a wide
scope, because there is a rich diversity of condensed
states of matter that we have learned to distinguish and
understand. Condensed matter systems are composed of large
numbers of constituent particles or agents of various
types, each with its own characteristics. When these
particles are interacting all kinds of unexpected things
may happen, and their collective will exhibit a variety of
emergent properties. This raises a question that can be
posed in two directions. On the one hand we may start from
the observed macroscopic behavior and ask what the
microscopic ingredients and mechanisms are that give rise
to that collective behavior. On the other hand the
microscopic constituents may be given and we are asked to
‘design’ a ‘medium’ that exhibits certain macroscopic
properties.


Figure III.2.1: A science of complexity. Condensed states
of matter are studied in the three basic disciplines, and
in the inter- and transdisciplinary fields that emerged in
between those disciplines.


Condensed matter physics is the systematic study of widely
different manifestations of order and disorder. It wants to
understand what characterizes the different phases and what
the underlying mechanisms are. We start this chapter with
an introductory overview of some general concepts and will
then focus on specific systems in the following sections.
The next chapter is devoted to the properties of the
electrons in solids.


A multidisciplinary field. The study of condensed states
of matter is by no means an activity only physicists are
concerned with. Quite the contrary, it is an inter- or
better transdisciplinary field, where the basic disciplines
of biology, chemistry and physics, as well as other,
interpolating fields, meet and inspire each other in many
ways. This research environment is sketched in
Figure III.2.1. Generally speaking the understanding of
collective (often emergent) behavior of large numbers of
similar constituents or agents is a principal objective of
what is called complexity science. But the interactions
typically go both ways: from individual to collective and
back, from local to global and back. Characteristic for
such systems is that they feature a variety of feedback
mechanisms whose effects are notoriously hard to understand
and model. The models and methodologies developed in
statistical physics and condensed matter theory offer
possibilities for adaptation in a much broader context of
complexity science, where they have proven to be applicable
in disciplines like economics and other social sciences.
Especially with the advent of large-scale computation,
which allows large-scale data processing and model
simulation (including the nonlinearities representing
feedback mechanisms), these parallels can be explored
quantitatively.


Just H₂O. Let me start with the familiar example of water.
In Figure III.2.2 I have schematically displayed the
different phases that can occur as a function of the
temperature T. If we start in the middle, say at room
temperature and a normal pressure of one atmosphere, then
it will be a liquid. If we heat it, it starts boiling at
100 °C, and will make a transition to the vapor or gas
state. And if we cool it, it will freeze and become ice.
These phases differ by the way the molecules are
aggregated.


Figure III.2.2: Collective behavior becomes less
predictable and harder to understand if we keep lowering
the temperature. (Source: Nobel.org)


Collective behavior. In discussing collective behavior we
distinguish a number of conceptual ingredients which we
will briefly highlight in this section. On the one hand we
have to know what the basic ingredients of which the system
is composed, often called constituents or agents, are. It
is important to know what their individual properties or
internal degrees of freedom are, but also what their
interactions look like. On the other hand we have to
determine what the possible external control parameters
are; in the context of physics these are typically things
like temperature, pressure and external fields.

The system may have different ways to aggregate, depending
on the ‘environment’, and as a consequence enter different
phases. We fix the environmental constraints by choosing
the values of the external parameters in certain ranges.
These external parameters already refer to macroscopic,
that is, collective state variables. The temperature of a
gas or liquid for example is linked to the average kinetic
energy of the molecules, and can be regulated by putting
the system in contact with a heat bath.
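The link between temperature and molecular motion can be made quantitative with a small sketch. It assumes the classical equipartition result that the average translational kinetic energy per molecule equals (3/2) k T, which is a standard relation but is not spelled out in the text; the function names and the approximate mass of a water molecule (18 atomic mass units) are likewise only illustrative choices.

```python
import math

K_BOLTZMANN = 1.380649e-23         # Boltzmann's constant in J/K
ATOMIC_MASS_UNIT = 1.66053907e-27  # kg


def mean_kinetic_energy(temperature_k: float) -> float:
    """Average translational kinetic energy per molecule, (3/2) k T, in joules."""
    return 1.5 * K_BOLTZMANN * temperature_k


def rms_speed(temperature_k: float, mass_kg: float) -> float:
    """Root-mean-square speed following from (1/2) m <v^2> = (3/2) k T."""
    return math.sqrt(3.0 * K_BOLTZMANN * temperature_k / mass_kg)


water_mass = 18.0 * ATOMIC_MASS_UNIT  # approximate mass of one H2O molecule
for temperature in (300.0, 373.15):
    energy = mean_kinetic_energy(temperature)
    speed = rms_speed(temperature, water_mass)
    print(f"T = {temperature:6.2f} K: <E_kin> = {energy:.2e} J, v_rms = {speed:.0f} m/s")
```

Doubling the temperature doubles the average kinetic energy, whereas the typical speed only grows with the square root of the temperature.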


We are led to the notion of a phase diagram, where we 
draw the space of external parameters (or lower dimensio- 
nal cross-sections thereof), and divide it into the domains 
corresponding to the different allowed phases. 


Moving through parameter space one encounters bound- 
aries that separate different phases, meaning that the sys- 
tem will go through a phase transition. The phases will 
exhibit different degrees of order and disorder on different 
levels. The question how to distinguish the various phases 
leads us to the notions of order parameters and correlation 
functions. 


Finally, once a phase has been recognized, we have to
identify the most relevant effective degrees of freedom of
the system in that phase. These are generally emergent
degrees of freedom which do not exist on the constituent
level. On the one hand these are the low energy modes
corresponding to so-called quasi-particles. You may for
example think of density waves in a solid, which are also
called phonons or ‘particles of sound’. On the other hand
in macroscopic media one often encounters so-called
defects; these are literally structural defects or
imperfections in the medium. Defects can be localized
(point-like) or extended (like a line or a wall). Defects
are robust for topological reasons, and they play a crucial
role in understanding the properties of such materials. For
example in a crystal one may have lattice defects, called
dislocations or disclinations, as we will show later on.


Let us now zoom in on the concepts we just introduced. 


Constituents and their degrees of freedom. When talking
about condensed states of matter, we assume such states to
be composed of many constituents. The constituents can
themselves be composite as well, like ions, atoms or
molecules. The constituents have certain properties like
mass, charge, magnetic moments (spins), in fact any of the
attributes we have been discussing in previous chapters.
The constituents will, depending on their properties, have
interactions, and these interactions may be strong or weak,
and may be long, short or intermediate ranged. For example
if particles have spin one-half they are fermions and
cannot occupy the same state, which has a huge impact on
their collective behavior. Also relevant is to what extent
the intrinsic degrees of freedom can be manipulated by
external controls, like an applied magnetic field for
example, which couples to all individual spins in the
system. Needless to say, it is precisely the rich variety
of constituents and their interactions (including
feedbacks) that allows for the splendid diversity of
possible states and phases of condensed matter.


In Figure III.2.3 I have indicated the substructures of the
most common systems and their typical degrees of freedom,
which may or may not play a decisive role, depending on the
question one is addressing. If we go down in scale the
substance may consist of one type or various types of
molecules, and much will depend on the shapes of the
molecules, referring to the charge distributions (the
molecular wave functions). These determine the electric and
magnetic dipole and higher moments, and as the molecules
are overall charge neutral, these moments are crucial and
determine the rigidity of the individual shapes. And
clearly these shapes are all-important for understanding
how the molecules can fit together in a stable way, which
in turn determines the allowed symmetries of a crystal to
be formed. If the molecules become large, like polymers for
example, one can imagine complex materials being assembled,
like biological tissues made from large biomolecules.


Figure III.2.3: A hierarchy of degrees of freedom. Building
blocks of a condensed matter system (in white) and their
‘degrees of freedom’ (in blue).


The relevant constituents may also just be atoms, and they 
may form crystals, where they optimally balance their ki- 
netic and potential energy, or alternatively their attraction 
and repulsion. The picture is that the nuclei sit on the sites 
of a lattice and the electron states may either be localized 
on the nuclei, or be spread out and extended. The elec- 
trons in the outer shell — so-called valence electrons — are 
relatively weakly bound and can hop to neighboring sites 
and in the case we are dealing with a conductor, they even 
have non-localized states that spread out over the whole 
lattice. So, the material is a highly ordered solid, but hid- 
den in there are the electrons which form a freely stream- 
ing (not-ordered) fluid supported by the solid substrate of 
highly localized ions. Similarly we may have a solid where, 
say, the atomic spins are ordered, in which case we have 
a ferro or anti-ferromagnet, or the spins may be disordered 
— pointing in random directions — and there would be no 
overall magnetization. And indeed, these charge and spin 
degrees of freedom can be manipulated by imposing ex- 
ternal electric or magnetic fields. 


Control parameters and phase diagrams. 


An important remark is that the ‘relevant degrees of free- 
dom’ of the system as a whole are not known a priori, ex- 
actly because they will mostly be emergent such as sound, 
spin waves, currents, defects etc. These emergent de- 
grees of freedom will strongly depend on the choices we 
make for the external parameters. These are for exam- 
ple the thermodynamic parameters such as temperature, 
pressure or chemical potential. Other parameters corre- 
spond to external electric and magnetic fields, or the chem- 
ical composition (or doping) of the material. Moreover, 
there is a dependence on the dynamics of preparation. If
we cool a liquid rapidly (called quenching), then it may not 
have had enough time to achieve the optimal type of long 
range order. It would stay somewhat amorphous, in con- 
trast with the perfect crystal which forms if we cool the liq- 
uid down slowly (called annealing). 


There are still other options for manipulating the system. 
You may change the relative concentrations of components. 
You may replace certain components by similar, or not so 
similar ones. You can add components (like solvents or
interstitials), or ‘dope’ the system by adding or removing 
charge carriers. These tools have been used in the most 
inventive ways to engineer materials with specific, some- 
times most unusual, but highly desirable properties. This 
advanced form of ‘legoism’ makes certain corners of ma- 
terial science look like a kind of black magic: a form of 
witchcraft with the distinctive feature that it works! 


The phase diagram. The parameter space may be divided into
domains corresponding to the different phases, and this
information is usually represented in a phase diagram.
Often we are interested only in particular phenomena and we
can restrict ourselves to smaller and lower-dimensional
cross sections of the parameter space. One axis that is
usually present is the temperature (or energy) axis, and
another is for example the pressure (or density) axis.


Figure III.2.4: Phase diagram. The standard phase diagram
of ice/water/vapor with the triple point, and the standard
definition of boiling and freezing point. Above the
critical point there is a smooth crossover from the liquid
to the vapor state.


If we add the pressure P, we can extend Figure III.2.2 to
the two-dimensional P-T phase diagram of Figure III.2.4.
This adds novel features: the normal boiling and freezing
points become lines, and as we see, these lines may join
(or split) at a so-called triple point. Furthermore a line
may terminate at a so-called critical point, where a clear
distinction between the phases ceases to exist.


Equation of state. The state variables are usually not
independent, since they have to satisfy a constraint, which
is called the equation of state. For a fixed amount of
stuff, say one mol, which means a total number of N_A
molecules, one finds that in the dilute gas phase for
example the ‘ideal gas law’ holds. This law states that
PV = RT, which is a functional constraint on the
macroscopic state variables P, V and T involving the
universal or molar gas constant R = N_A k, which is just a
fixed number (the product of Avogadro’s and Boltzmann’s
constants). If we for example consider a fixed amount of
gas in a container of fixed volume V, the equation tells us
that lowering the temperature would lower the pressure
proportionally (at least in the lower right-hand corner of
the diagram, where the ‘law’ holds).
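As a small numerical illustration of this constraint, the sketch below evaluates P = RT/V for one mol of an ideal gas in a fixed container. The chosen volume of 22.4 liters (roughly what one mol occupies at 0 °C and 1 atmosphere) and the function name are assumptions made for the example only.

```python
R_GAS = 8.314462618    # molar gas constant R = N_A * k in J/(mol K)
ATM_IN_PA = 101_325.0  # one atmosphere in pascal


def pressure_pa(temperature_k: float, volume_m3: float, moles: float = 1.0) -> float:
    """Ideal gas law P V = n R T, solved for the pressure in pascal."""
    return moles * R_GAS * temperature_k / volume_m3


volume = 0.0224  # m^3, assumed container volume (about 22.4 liters)
p_warm = pressure_pa(273.15, volume)
p_cold = pressure_pa(273.15 / 2, volume)
print(f"P at 273.15 K: {p_warm / ATM_IN_PA:.3f} atm")
print(f"P at half that temperature: {p_cold / ATM_IN_PA:.3f} atm")
print(f"ratio: {p_cold / p_warm:.2f}")
```

Halving the absolute temperature at fixed volume halves the pressure, which is the proportionality mentioned above.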


Phase transitions. Crossing a phase boundary in a phase
diagram means that the system goes through a phase
transition. Let us for a moment look at the dark blue line
separating the liquid and gas or vapor phases. Crossing
that line from blue to light brown means boiling the
liquid. What you immediately see is that this may happen at
any point on that line segment. If we boil an egg on a
Sunday morning, we have a fixed normal pressure of
1 atmosphere, and by heating the water we move to the right
on the dashed red line until we hit the transition point at
100 degrees Celsius. But a less practical way to boil an
egg would be to start at high pressure with water at
100 °C; the water is not boiling then, but when we lower
the pressure, sure enough, when it hits 1 atmosphere the
water would start boiling. This boiling process would
correspond to crossing the phase boundary top down along
the vertical dashed red line, starting at the high red
point and moving to the pink point straight below. High in
the mountains the pressure of the atmosphere is lower and
thus water boils at a lower temperature (roughly 3 degrees
lower per kilometer of elevation), which can make preparing
your soft-boiled Sunday morning egg quite a hassle. Often
phase transitions signal the occurrence of a tipping point
in some (free) energy landscape of the system due to
changes in the control parameters. And in that sense the
phase diagram is a natural characterization for any
multi-particle or multi-agent system.
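How much the boiling point drops with altitude can be estimated with a rough sketch. It rests on two textbook approximations that are not part of the text: the Clausius-Clapeyron relation with a constant heat of vaporization to follow the liquid-vapor line, and an isothermal atmosphere whose pressure falls off exponentially with a scale height of about 8.4 km. The numerical values and function names are assumptions for illustration, so the output should be read as an order-of-magnitude estimate of a few degrees per kilometer.

```python
import math

R_GAS = 8.314            # molar gas constant in J/(mol K)
LATENT_HEAT = 40_660.0   # assumed heat of vaporization of water in J/mol
T_BOIL_1ATM = 373.15     # boiling point of water at 1 atmosphere in K
SCALE_HEIGHT_M = 8_400.0 # assumed scale height of the atmosphere in m


def pressure_atm(altitude_m: float) -> float:
    """Air pressure in atmospheres for an isothermal model atmosphere."""
    return math.exp(-altitude_m / SCALE_HEIGHT_M)


def boiling_point_k(p_atm: float) -> float:
    """Clausius-Clapeyron: ln(P/P0) = -(L/R) (1/T - 1/T0), solved for T."""
    inverse_t = 1.0 / T_BOIL_1ATM - (R_GAS / LATENT_HEAT) * math.log(p_atm)
    return 1.0 / inverse_t


for altitude in (0.0, 1000.0, 2000.0, 3000.0):
    t_boil = boiling_point_k(pressure_atm(altitude))
    print(f"{altitude:6.0f} m: water boils at about {t_boil - 273.15:5.1f} °C")
```

The same function also describes the less practical route of the text: starting at a pressure above 1 atmosphere, boiling_point_k returns a temperature above 100 °C, so water at 100 °C only starts to boil once the pressure has been lowered to 1 atmosphere.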


Critical points. At a critical point, a phase separation
line terminates. This means that the clear distinction
between the two phases, and the marked transition between
them, somehow disappears. We enter a critical region in
which there is a smooth crossover between, in this case,
the liquid and the vapor. In fact the usual clear surface
separating them disappears and becomes a foggy layer.


Figure III.2.5: A tabular iceberg. In October 2018 a NASA
inspection team discovered this huge, perfectly
rectangular, so-called tabular iceberg in the Antarctic.
Such bergs are formed naturally and have a strikingly
rectangular geometry, reflecting the underlying crystal
structure. They are not single giant monocrystals, though
they look like it. (Source: NASA ICE)


Ice? What ice? In Figure III.2.6 we show a tiny corner of
the phase diagram of water at very high pressures, which is
therefore not present in Figure III.2.4. It would have
appeared high up on the left, in the direction where the
arrow is pointing. The diagram shows that if you make the
pressure large enough, the water will become solid even at
higher temperatures. You furthermore see that there are
actually many distinct solid phases up there. They are
forms of ice that differ by their crystal symmetries. Some
are hexagonal (I), others tetragonal (III, VI), monoclinic
(V), rhombohedral (II) or cubic (not in the graph). A true
Baskin-Robbins of structures, but (I am sure) all equally
tasteless. Furthermore many of these fancy phases are
metastable, so they tend to decay into more stable
versions. Note also the impressive number of triple points
in the phase diagram. Such is the hidden diversity of
something as common as water. It shows its complex behavior
only under extreme conditions.


Figure III.2.6: Ice varieties. In the high pressure regime,
high up along the vertical axis of the previous diagram,
there are many distinct solid phases of water, where the
water molecules happen to organize according to different
symmetries.


Water versus Argon. It is interesting to compare the
features of the phase diagram of water with, for example,
that of the noble element argon. The element ⁴⁰Ar has 18
protons, and its 18 electrons completely fill all energy
levels up to the n = 3, l = 1 shell of atomic states. These
completely filled shells make the element stable and
resistant to bonding to any companion. The noble elements
are ‘Einzelgänger’, or ‘lonely cowboys’ so to say; they
apparently have everything they need, and are like extreme
individualists who love to ignore their neighbors. Under
normal conditions argon is an inert gas, and it has a phase
diagram similar to that of water as depicted in
Figure III.2.4, though the corresponding points are
positioned at different locations. As is clear from
Table III.2.1, for argon things happen at much lower
temperatures, which indeed is a consequence of its
‘nobility’.


If one were to continue the phase diagram for argon to high
pressures, one would surely see the melting line bend over
to higher temperatures, meaning that liquid argon would
Table III.2.1: Comparison of water and argon.


                   Water                Argon
Phase diagram      T [K]     P [atm]    T [K]     P [atm]
Melting point      273.15    1          83.81     1
Boiling point      373.15    1          87.30     1
Triple point       273.16    0.006      83.81     0.68
Critical point     647.10    217.7      150.69    48.0


just like water, solidify at very high pressures. This, however,
only happens at pressures of tens of thousands of
atmospheres! Furthermore, because this simple atom has
so few degrees of freedom, it exhibits only one solid phase.
This means that the analogue of Figure III.2.6 for argon would
be rather boring, because it would just show one melting
line going across from left to right.


At this point two observations can be made. On the one 
hand there are universals in phase diagrams, like that con- 
densed matter will become solid under high pressure or at 
low temperatures (including the familiar triple and critical 
points). On the other hand, phase diagrams may exhibit a 
huge structural diversity that depends on the specifics of 
the constituents, whether they are simple spherical atoms, 
or composites with many internal degrees of freedom like 
water molecules. 


Crystals. The reason that solids — usually crystals — form 
is that by bringing many atoms close together the orbits 
of the electrons start overlapping and the electrons start 
moving around changing nuclear partner so to say, which 
leads to an effective attraction. However, if they get too 
close the effect of the repulsion of positively charged nu- 
clei starts to dominate. Balancing attraction and repulsion 
the atoms tend to organize themselves into an optimal pat- 
tern that minimizes their overall interaction energy. This is 
basically how crystals form. In a crystal the positioning of 
the atoms is strictly periodic which implies strong spatial 
correlations over large distances, corresponding to some
discrete translational (and rotational) symmetries. Complexity
and beauty apparently arise where attraction and
repulsion strike a subtle balance. 


Hard versus soft condensed matter. The field of con- 
densed matter physics is divided up into two parts: soft 
and hard condensed matter physics comprising the topics 
we have indicated in Figure III.2.7.


Soft matter. By soft matter we mean liquids, colloids,
gels, and molecular materials like polymers and biomaterials.
It is a diverse field that often involves physics at an intermediate,
so-called mesoscopic, scale, like nanostructures
for example. This field mostly employs methods from
classical physics, such as statistical mechanics and clas- 
sical field theory, but also lots of chemistry. It is the branch 
of condensed matter physics most remote from hard core 
quantum theory, but it has become an innovative field with 
a wide range of applications. One of its most influential
protagonists was Pierre-Gilles de Gennes of the Collège
de France in Paris, who received the 1991 Nobel
prize for his extensive oeuvre. This field has led to
beautiful insights into the role of symmetry and its break- 
ing. We will therefore in the following not just discuss crys- 
tals, but also liquid crystals and quasicrystals. 


Hard matter. Hard matter is the present incarnation of what 
used to be called solid state physics. It studies proper- 
ties of materials where quantum theory is absolutely in- 
dispensable. Quantum properties are vital for understand- 
ing the role electrons and lattice vibrations play. In the 
quantum realm these can rearrange themselves in col- 
lective quasi-particle degrees of freedom, with totally un- 
expected emergent properties, like superfluidity, low and 
high temperature superconductivity, and topological order. 
These latter phases, for example fractional quantum Hall 
systems, include new degrees of freedom called anyons 
with exotic spin and statistics properties. Towards the end 
of Chapter III.3 we will take a closer look at them. 
Figure III.2.7: Hard versus soft. Condensed matter can be
roughly divided up into ‘soft’ and ‘hard’ matter. Both are of great 
technological importance. 


Plasma. We have seen that for large pressures most sys- 
tems become solid. There is of course also the other ex- 
treme regime, corresponding to high temperatures, which 
is of interest as was already indicated in Figure III.2.2. For 
very high temperatures, there is yet another phase tran- 
sition: the water molecules will ionize, which means that 
they will break up into two oppositely charged components,
the OH⁻ and H⁺ ions:


$$\mathrm{H_2O} \rightarrow \mathrm{OH^-} + \mathrm{H^+}$$


This is again a quite different state of water. It is still overall 
electrically neutral, but it will couple strongly to electric and 
magnetic fields, because the individual components (and 
constituents) do. If you apply a voltage over the plasma, 
currents will flow, and clearly, the positively and negatively
charged components will run in opposite directions. 


In Chapter I.3, where we talked about fusion, we mentioned
the crucial role played by the tritium plasma as a ‘fuel’. And
in the previous chapter we alluded to the state of the very
early universe as a primordial soup; this refers to a universal
plasma made up of bare ‘charges’ for all interaction
types. Of special interest is the colored component of the
soup called the quark-gluon plasma, which is nowadays
studied experimentally by smashing lead ions into each
other in the Large Hadron Collider at CERN, by the so-called
ALICE collaboration. In that experiment one tries to
recreate, for a tiny period of time, a tiny bit of the early
universe. It is fascinating to realize that not only with space
observatories but also with big accelerators one is trying
to get ever closer to the Big Bang, thus contributing to
cosmology.


Order versus disorder 


We have indicated the importance of identifying different 
phases. These are roughly characterized as ordered and 
disordered phases, but also phases that sit in between. 
Solids are highly ordered, gases are disordered, and sim- 
ple liquids tend to be more like dense gases, but if the 
constituents are more complicated they can be both. Both 
ordered and disordered! How can that be? Well, it de- 
pends on which degrees of freedom you are talking about. 
In a liquid crystal for example, the positions of the molecules
are not frozen into a crystal (disorder), but the orientations
of the molecules are all aligned (order). Glass
is rigid like a solid, yet its molecules lack long-range order, as in a liquid.
And what about gels, polymers, and biomaterials, are
they ordered and in what ways? In a conductor the nuclei 
have fixed positions in the crystal lattice, yet at the same 
time the conducting electrons form a liquid that flows freely 
through the material. The diverse topics we have men- 
tioned so far used to belong to different fields of study but 
are more and more integrated because similar techniques 
are used to study them. 


One of the fascinating results from classical physics, in 
particular statistical thermodynamics, is that certain dis- 
ordered equilibrium states like a gas of atoms or a liquid 
can still be rather easily described if one applies statistical
methods to them. After all, the behavior of a mole of
a dilute gas, consisting of some 10²³ atoms in equilibrium,
can to a first approximation be described in terms of only a
few macroscopic variables like pressure P, temperature T,
volume V and entropy S, which have to satisfy the
ideal gas law, PV = RT. Such a drastic reduction of variables
can be performed if one is only interested in the most
relevant degrees of freedom that effectively describe the 
equilibrium states of the collective in a given phase. 
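

To get a feeling for how little information the ideal gas law needs, here is a minimal sketch in Python that computes the volume of one mole of gas from just two macroscopic numbers. The function name and the choice of the ice point at one atmosphere are illustrative assumptions, not something taken from the text.

```python
# Ideal gas law for one mole: P V = R T.
R = 8.314462618   # molar gas constant in J/(mol K)
ATM = 101325.0    # one standard atmosphere in Pa

def molar_volume(temperature_k, pressure_pa):
    """Volume in m^3 occupied by one mole of an ideal gas."""
    return R * temperature_k / pressure_pa

# Illustrative state: the ice point of water at one atmosphere.
v_ice_point = molar_volume(273.15, ATM)
print(f"{v_ice_point * 1000:.2f} litres per mole")
```

Two numbers, T and P, fix the volume of some 10²³ atoms, which is the drastic reduction of variables referred to above.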


The gas molecules bounce around randomly, yet, even 
though the individual behavior of the atoms is highly er- 
ratic, the collective is surprisingly well behaved and highly 
predictable. As every insurance company can tell you, as 
long as the number of clients is sufficiently large, statistics 
becomes an extremely reliable tool for predicting the probability
of certain events. In the classical theory of somewhat
less dilute gases, where one takes the size of the
atoms and the attraction between them into account,
one arrives at the Van der Waals equation of state.
This equation is an important generalization of the ideal 
gas law from a conceptual point of view, because it pre- 
dicts a phase transition to a liquid state. We will return to 
this equation shortly. 
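

As a hedged illustration of why this equation signals a phase transition, the sketch below uses the standard textbook form (P + a/V²)(V - b) = RT, which the text does not write out, expressed in reduced units where pressure, volume and temperature are measured relative to their values at the critical point. The function names and the scanning range are assumptions made for this example only.

```python
def reduced_pressure(v, t):
    """Van der Waals pressure P/Pc as a function of v = V/Vc and t = T/Tc."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

def has_rising_segment(t, v_min=0.5, v_max=5.0, steps=2000):
    """True if the pressure increases with volume somewhere on the isotherm."""
    dv = (v_max - v_min) / steps
    volumes = [v_min + i * dv for i in range(steps + 1)]
    pressures = [reduced_pressure(v, t) for v in volumes]
    return any(p2 > p1 for p1, p2 in zip(pressures, pressures[1:]))

for t in (0.9, 1.2):
    kind = "has a minimum and a maximum" if has_rising_segment(t) else "falls monotonically"
    print(f"T/Tc = {t}: isotherm {kind}")
```

Above the critical temperature the pressure simply falls as the volume grows, as for an ideal gas. Below it the isotherm develops a stretch where pressure and volume rise together, which cannot describe equilibrium states and marks the region where liquid and vapor compete.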


It turns out that the most complicated behavior is observed 
near a phase transition. There the distribution of ther- 
mal fluctuations broadens; fluctuations apparently occur 
on all scales which means that they are not distributed 
like a Gaussian distribution with a well-defined mean and 
variance around the mean. No, the distributions behave 
like power laws, where compared with the Gaussian, the 
venom is in the tail of the distribution. Whereas the ex- 
ponential distribution tends rapidly to zero, the power laws 
have so-called fat tails. These tails describe so-called ‘high 
impact, low probability’ events, but the point is rather that 
in spite of the fact that these events are far away from the 
average, their probability is actually not so small after all, in 
fact gigantic compared to an exponential distribution. With
power laws, extreme events in the tail of the distribution
cannot be discarded at all. Indeed, under such circum- 
stances, insurance brokers are not that eager anymore to 
sell you an insurance policy, and if they do, they will cer- 
tainly make you pay a good deal more to cover their sub- 
stantial risks. 
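

The difference between the two kinds of tails can be made concrete with a small numerical comparison. This is only a sketch: the Gaussian is taken with unit variance, and the power law is a Pareto distribution with exponent 2 and lower cutoff 1, both of which are assumptions chosen for illustration.

```python
import math

def gaussian_tail(k):
    """P(X > k) for a standard normal variable (mean 0, variance 1)."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def power_law_tail(k, alpha=2.0, x_min=1.0):
    """P(X > k) for a Pareto distribution with exponent alpha, valid for k >= x_min."""
    return (x_min / k) ** alpha

for k in (2, 5, 10):
    g, p = gaussian_tail(k), power_law_tail(k)
    print(f"k = {k:2d}: Gaussian {g:.2e}, power law {p:.2e}, ratio {p / g:.1e}")
```

An event ten units out is a one-in-a-hundred occurrence for this power law, while for the Gaussian it is so improbable that it can safely be ignored. That is the venom in the tail.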


Phases, order parameters and correlations. So what 
then determines in what sense a system is ordered or dis- 
ordered? 


Order parameters. There is a special set of observables 
important for the identification of different phases: these 
are denoted as local order parameters, which are called 
local because they depend on the position x . To probe the 
difference between a vapor and its liquid state, the order 
parameter would be the local density ρ(x). In the transition
it would make a sudden jump from a tiny to a large constant
value ρ(x) = ρ₀. For magnetic systems the order parameter
is the magnetization M, which is the spatial average of
the local magnetization M(x), which in turn corresponds 
to a local average of a sizeable number of spins centered 
around the point x. In ferromagnetic metals spontaneous magnetization
sets in at the so-called Curie temperature, which means
that the magnetization M acquires a non-zero value below
this temperature. So, to conclude, order parameters
are specific observables that probe for a structural change 
in the state of the system when it goes through a phase 
transition. 


First- and second-order phase transitions. We distinguish 
two types of phase transitions called first- and second- 
order transitions. For the second order transition the order 
parameter changes continuously (but not smoothly) from 
zero to a non-zero value. A typical example is sponta- 
neous magnetization which we just mentioned and will dis- 
cuss in more detail shortly. 


Correlations. The order parameters correspond to the av- 
erage property of a local quantity. But a measure of order 
can also be more subtle and correspond to probing multi-local
correlations in space or time in the system. For example,
if you have a crystal, then many of its properties are
periodic, and strongly correlated spatially. Measuring such 
correlations can then help you identify the spatial structure 
and symmetries of the crystal. A famous technique, X-ray 
diffraction, does exactly that: it yields a diffraction pattern 
in which such spatial correlations are encoded, and from 
which the three-dimensional crystal structure can be re- 
constructed. 


First-order transitions. In a first-order transition the order 
parameter jumps discontinuously. A nice example is the 
liquid-vapor transition (evaporation or boiling), where there 
is a region in parameter space where both phases can co- 
exist, but where one of them becomes unstable. The tran- 
sition then often takes place through bubble nucleation, as 
we know too well from the ordinary boiling phenomenon. 
Inside the bubbles we have the new phase and outside 
the bubble is still the old liquid phase. Because of thermal 
fluctuations, bubbles spontaneously form in the liquid, and 
if they have a sufficient size they will start growing. The 
threshold is set by the competition between the energy it costs to make the wall
(proportional to the surface area of the bubble) and
the energy gain, which is given by the energy difference
between the two phases and is proportional
to the volume of the bubble. Clearly, if the bubble
is large enough the volume term wins and the bubble will 
start expanding. If you transfer more heat to the liquid, 
more and larger bubbles will form, and those may further- 
more coalesce. This process continues until the transition 
is completed and there is no liquid left.
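

The surface-versus-volume competition can be sketched in a few lines of Python. The model assumes a spherical bubble with a wall energy sigma per unit area and a bulk energy gain delta_f per unit volume; these names, and the arbitrary units, are assumptions introduced for this illustration.

```python
import math

def bubble_energy(r, sigma, delta_f):
    """Net energy of a spherical bubble: surface cost minus volume gain."""
    return 4.0 * math.pi * r**2 * sigma - (4.0 / 3.0) * math.pi * r**3 * delta_f

def critical_radius(sigma, delta_f):
    """Radius where bubble_energy is maximal; larger bubbles gain energy by growing."""
    return 2.0 * sigma / delta_f

def break_even_radius(sigma, delta_f):
    """Radius where the surface cost exactly equals the volume gain."""
    return 3.0 * sigma / delta_f

sigma, delta_f = 1.0, 2.0   # arbitrary illustrative units
r_c = critical_radius(sigma, delta_f)
print(f"critical radius {r_c}, energy barrier {bubble_energy(r_c, sigma, delta_f):.3f}")
print(f"break-even radius {break_even_radius(sigma, delta_f)}")
```

In this simple model a bubble smaller than the critical radius lowers its energy by shrinking and disappears, while a bubble that a fluctuation pushes beyond it keeps growing. Supplying more heat increases delta_f, which shrinks the critical radius and lets more bubbles survive.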


In Figures III.2.8 and III.2.9 we have depicted the liquid-vapor
transition from two complementary points of view.
The first figure shows the transition in a pressure-volume 
(P, V) diagram. The colored curves are different isotherms 
(curves of constant temperature). The yellow one corre- 
sponds to a high temperature and reproduces the ideal 
gas law, P = RT/V. The orange isotherm where T = T_c


Figure III.2.8: Van der Waals equation of state. We have
sketched three isotherms, meaning P as a function of V with T
fixed. The yellow one is for T > T_c, where we recover the ideal
gas law. The orange one is for T = T_c, and the purple one
corresponds to the boiling process as described in the text.


is special because all lower isotherms have a minimum 
and a maximum. The purple curve is the 100° Celsius 
isotherm and describes the process corresponding to the 
vertical transition marked in Figure III.2.4. The points on an
isotherm supposedly correspond to equilibrium states, but 
that cannot always be the case. The segment highlighted 
with the dashed red line cannot represent physically ac- 
ceptable states because increasing the volume would also 
increase the pressure, but for physical states it is the other 
way around, the ‘compressibility’ in those points has the 
wrong sign. So only the descending parts of the isotherm 
represent allowed equilibrium states. What makes these 
curves interesting is precisely that for T < T_c, we see that
for a certain pressure range there are two possible states: 
the left one corresponding to the liquid and the one on the 
right to the vapor. The picture does immediately suggest 
the explanation. We can slowly descend the 100° isotherm 
by increasing the volume and thereby lowering the pressure,
keeping the system in equilibrium until we hit the dotted
line at the pink point (where P = 1 atm). This is where
the boiling transition starts, and as we all know this is a
pretty violent non-equilibrium type of process that works
through bubble nucleation, and continues until all the liquid
has vaporized and the system can restore equilibrium
in the vapor state on the isotherm (corresponding to
the pink point on the right). From there the system may
move down again if the volume is further enlarged. In the
intermediate region during the crossing we have two coexisting
phases in the system: part is liquid and part is vapor.
The whole transition trajectory, marked by the dashed black
arrow between the two pink dots, thus corresponds to the
single pink dot in the phase diagram of Figure III.2.4. This teaches
us that the phase diagram certainly tells us that there is
a transition, but does not inform us in any way about how
that transition actually takes place, and whether it is a first-
or second-order transition.


Figure III.2.9: Free energy landscapes for different temperatures.
The minima correspond to the equilibrium liquid and/or
vapor states. The yellow trajectory corresponds to the horizontal
‘boiling’ trajectory in the phase diagram of Figure III.2.4. The liquid
state is stable, until we hit the red curve where the vapor
minimum has lower energy and the system makes the boiling
transition to that stable vapor state.


Minimizing the free energy. Now in the second figure, 
Figure III.2.9, we look at the first-order transition from the 
point of view of the free energy F = F(V,T) of the sys- 
tem, and this time it is convenient to take the horizontal 
trajectory in the phase diagram, corresponding to the familiar
boiling process we witness in the kitchen.¹ In the
figure we plotted the free energy as a function of volume 
for increasing temperatures. The equilibrium states corre- 
spond to minima of the free energy and we see that there 
is a range of temperatures where we have two minima. 
We have a fixed amount of matter, so the left minimum 
is the small volume or liquid state, and the right minimum 
is the vapor state. We start at a low temperature equilib- 
rium state corresponding to the unique minimum. If we 
start raising the temperature, we see that the energy land- 
scape is changing. Once we arrive at the light blue isobar 
it develops a second (local) minimum, but it has higher en- 
ergy and is therefore unstable. If an outlandish fluctuation 
somewhere in the liquid happens to create a tiny vapor 
bubble, this bubble would instantly collapse because there 
is nothing to gain (energy-wise) by being a bubble. How- 
ever by going to higher temperatures the values of F for 
the two minima become equal, and on the red curve the 
vapor minimum has become clearly lower than the liquid 
one. Then indeed, the liquid state becomes metastable. 
Even moderate fluctuations will create bubbles that are big 
enough to start growing, thereby executing the actual vaporization
process. You also see that even if we are careful
and succeed in superheating the liquid, we eventually hit the dark
blue point where the minimum corresponding to the liquid
disappears. At that point the liquid state becomes unstable 
and the transition necessarily takes place. 
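

The sequence of landscapes described above can be mimicked with a toy model. The quartic function below is not the free energy of water; it is an assumed double-well shape in which x plays the role of the volume and the control parameter t plays the role of the temperature, chosen only because it shows the same succession of one minimum, two competing minima, and one minimum again.

```python
def free_energy(x, t):
    """Toy double-well landscape; t tilts the balance from the left to the right well."""
    return x**4 - 2.0 * x**2 - t * x

def local_minima(t, x_min=-2.5, x_max=2.5, steps=5000):
    """Scan a grid and return (position, value) for every local minimum found."""
    dx = (x_max - x_min) / steps
    xs = [x_min + i * dx for i in range(steps + 1)]
    fs = [free_energy(x, t) for x in xs]
    return [(xs[i], fs[i]) for i in range(1, steps)
            if fs[i] < fs[i - 1] and fs[i] < fs[i + 1]]

for t in (-2.0, -0.5, 0.5, 2.0):
    minima = local_minima(t)
    summary = ", ".join(f"x = {x:+.2f} (F = {f:+.2f})" for x, f in minima)
    print(f"t = {t:+.1f}: {summary}")
```

For strongly negative t only the left (‘liquid’) minimum exists. As t increases a second (‘vapor’) minimum appears on the right, first higher, then lower than the liquid one, and for large t the left minimum disappears altogether, so the system has no choice but to make the transition.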


¹This is a process at fixed pressure, and is naturally represented by
equal pressure lines or so-called isobars in a (T, V) diagram.


Tipping points. It is worth pointing out that the free energy
diagram is quite universal for understanding the origin
of tipping points in all kinds of multi-agent systems.
The free energy would correspond to some relevant ‘utility
function’ the system wants to minimize (environmental
constraints, costs, etc). The ‘fitness’ landscape will in gen- 
eral depend on other (control) variables. For example we 
have a society burning fossil fuels which provides us en- 
ergy for $X per kWh. Around 1970 the landscape started 
to change in that another possibility appeared, namely so- 
lar power. It is still expensive, and without some local sub- 
sidies a local effort easily collapses. However the price 
comes down rapidly, and the second minimum of the utility 
function starts competing. Ambitious countries, states and 
cities may create successful local bubbles that are eco- 
nomically feasible and start to grow. And that is how the 
energy transition will presumably take place in the present 
age. The energy transition is typically a first-order transi- 
tion, and as a matter of fact we see it happening all around 
us! This shows you the metaphorical power of the boiling 
process as a model for certain types of transitions and the 
visualization with the two competing minima as a powerful 
analogy. 


Collective degrees of freedom: quasi particles. Once 
the system has chosen a different ground state correspond- 
ing to a new phase and another minimum of the free en- 
ergy, we should ask what other aspects of the physics of 
the system have changed. Most importantly, we should 
find out what the low energy excitations of the system in 
the new state are. The low energy excitations are of interest
because they are the first that will get excited if we
perturb the system, and as such they determine more than
anything else the emergent properties of the system in the
new phase. These modes also help to identify and label
the collective states: whether the system is a conductor of heat
or electricity, for example, or a magnetically ordered
ferromagnet.


What happens to a crystal if I hit it? This is like probing
the system by locally deforming it and observing the response
of the system to that deformation. We study how
the deformation propagates through the system, that is, how the
deformation energy starts spreading. The resulting propagating
modes are the low energy excitations; in this case
they are longitudinal density waves, which correspond to
sound. Sound is an emergent phenomenon because an 
individual atom does not know what sound is, it cannot 
make sound by itself. It needs the ordered collective to 
propagate, and in that sense it is just like the ‘wave’ that 
can be excited in a football stadium: to let it propagate 
through the crowd requires a collective effort. And if a large 
fraction of the audience are fans of the opposing team, it 
will definitely not propagate. The point I am making is that
by studying the response of the system to perturbations we 
get to know a lot about its ground state or phase. 


In reality the molecular systems we consider are more complicated
and we do not only have to worry about the positions
of the nuclei in the crystal lattice. For example,
the nuclei may have a tiny magnetic moment, associated with their spin,
which means that they are like tiny bar magnets. If the system
is at a relatively high temperature these little magnets
will point in arbitrary directions. They are highly indepen- 
dent, and thus their orientations are uncorrelated even on 
short distances. So in this case we have that the nuclei 
are strictly ordered because they form a crystal, while their 
spins are not ordered at all. Apparently we have to be spe- 
cific if we say that a system is ordered. 


The behavior of electrons. Another crucial ingredient of 
most condensed matter systems that we have not men- 
tioned so far are the electrons. Given the underlying lattice 
structure of the nuclei, what is the quantum behavior of the 
electrons in that given background? Do they stay localized, 
close to ‘their’ nucleus, or do they start hopping around 
freely, or do they form a conducting fluid of some sort? 
It turns out that the behavior of the collective of electrons 
in condensed states of matter is highly diverse and keeps 
surprising us up to today. Understanding this behavioral variety
is one of the main drivers of condensed matter physics.
These problems have been studied for decades and time
and again new fundamental properties are discovered, often
leading to important technological innovations. We focus
on some of these types of behavior in the remainder of
this section. A more thorough analysis is given in the next
chapter.


[Phase diagram regions: solid ⁴He; liquid He I; liquid He II (superfluid)]

Figure III.2.10: Phase diagram of ⁴He. Comparing this phase
diagram with the conventional one of Figure III.2.4, a new superfluid
phase has opened up at low temperature, splitting the triple
point into two triple points. The critical point is marked in green.


The quantum regime. In Chapter I.3 we discussed scales
and units, and pointed out that at low energies quantum 
theory necessarily comes into play, which leads to another 
plethora of conceivable physical states that can have high- 
ly unusual properties like superfluidity and superconduc- 
tivity. 


In the quantum regime we should expect that the quant- 
essential spin and statistics properties of particles come 
into play but also that the Heisenberg uncertainty relations 
will manifest themselves in the collective behavior. Of spe- 
cial interest is the possibility that bosons can occupy the 
same state. What typically happens is that once you lower 
the temperature far enough, a macroscopic number of the 
bosonic particles will occupy the same lowest energy state. 
The system forms a so-called Bose condensate, a special 


Figure III.2.11: Superfluidity. A vessel filled with liquid He,
which turns into a frictionless superfluid when cooled below
the λ-point. It will then spontaneously creep over the wall of the
vessel until the vessel is empty.


quantum coherent state, which means that the system will 
go through a phase transition. Systems where this hap- 
pens will exhibit ‘macroscopic quantum’ behavior. Quan- 
tum matter phases have been in the centre of attention for 
quite a long time, and still many novel phases are discov- 
ered, which pose formidable puzzles for the theorists to 
understand, like for example high temperature supercon- 
ductivity. There are still many open questions with regard 
to understanding collective quantum phenomena from a 
microscopic, first principles point of view. 


Superfluidity. Let us consider the famous example of 
Helium-4, a boson, where you can see how the quan- 
tum behavior, the formation of a Bose condensate, adds 
a new phase to the phase diagram. The phase diagram 
of Figure III.2.10 reflects the (actually not so recent) discovery
of superfluidity by the Russian physicist Pjotr Kapitza
in 1937 (and independently by J.F. Allen and D. Misener).²


Kapitza received the 1978 Physics Nobel prize for his landmark
contributions to low temperature physics, of which this discovery
clearly was an outstanding one. We see that, compared
to the standard phase diagram of Figure III.2.4, in
the low temperature region the superfluid phase has been
added. Kapitza discovered that phase by just lowering the
temperature of some ⁴He vapor under the standard pressure
of one atmosphere, so he came from the right at the
height of the horizontal red dotted line in the diagram. He
first crossed the ‘standard’ transition from vapor to liquid,
but then at a temperature of 2.17 K, at the so-called λ point,
he witnessed the transition to the superfluid phase. The
ordinary triple point had ‘opened up’ and with the appearance
of the new phase it split up into two triple points. A
superfluid displays the curious property of frictionless flow,
and therefore behaves rather ‘creepy’ in the literal sense.
If you watch an open container filled with superfluid, you
will see the fluid all by itself creep over the rim and run
down the outside of the vessel. In Figure III.2.11 we have
sketched an experiment along these lines: the self-emptying
mug! Thank heaven there is friction! Thank heaven
that our superdrinks are not superfluids!


²The discovery was made at a time of international tensions, and
therefore the attribution of credit has been somewhat controversial. An interesting
historical account can be found in S. Balibar, The discovery of superfluidity,
Journal of Low Temperature Physics, Vol. 146, Nos. 5/6,
2007.


Magnetic order


Magnetization. Magnetic properties of atoms are the combined
result of three components: (i) the electrons have
spin with an associated magnetic moment of one Bohr
magneton μ_B; (ii) the atomic orbits of electrons correspond
to states with a magnetic quantum number m, which means
that the magnetic moment of the orbit equals mμ_B; and (iii)
finally there is the nuclear magnetic moment, which turns
out to be a factor of a thousand smaller. We will not enter
into any detailed discussion of how these interact but will
just assume atoms, ions, or electrons to have some overall
spin or magnetic moment. For the spins we can now
also introduce an order parameter, called the magnetization
M(x), the average magnetic orientation of a certain
number of spins around the point x. If the temperature is
high we know that because of the random orientation of
the spins, the average magnetization ⟨M(x)⟩ in thermal
equilibrium will be zero. But if we cool the medium down, then the
disturbances in the lattice become smaller and the magnets
will feel each other and can lower the energy of the
state by aligning, in which case a phase transition will take
place.


Phase transition at the Curie point. At a certain temper- 
ature called the Curie point there will be a phase transition 
to a state where all spins will spontaneously align. Order 
is spontaneously created and the order parameter will acquire
a non-vanishing constant, that is to say a position-independent,
value: ⟨M(x)⟩ = m₀ ≠ 0. This emergent form
of order is called spontaneous magnetization, and the sys- 
tem is in a ferromagnetic phase and as a whole behaves 
like a single big magnet. So we may conclude that ordinary 
permanent magnets are made of materials of which the 
Curie temperature lies far above room temperature. And 
as expected the order parameter thus signals whether the 
system is ordered or disordered. 
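

How an order parameter switches on below the Curie point can be illustrated with the simplest textbook description, the Weiss mean-field approximation, in which the magnetization per spin m satisfies m = tanh(m T_c / T). This model is an assumption brought in for illustration; it is not derived in the text, and the function name is hypothetical.

```python
import math

def mean_field_magnetization(t_over_tc, iterations=2000):
    """Solve m = tanh(m / (T/Tc)) by fixed-point iteration, starting from full alignment."""
    m = 1.0
    for _ in range(iterations):
        m = math.tanh(m / t_over_tc)
    return m

for t in (0.5, 0.9, 1.1, 1.5):
    print(f"T/Tc = {t}: m = {mean_field_magnetization(t):.4f}")
```

Above the Curie temperature the only solution is m = 0, while below it a non-zero magnetization appears and grows continuously as the temperature drops, which is the behavior expected of the order parameter in a second-order transition.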


Low energy modes: spinwaves. The low energy modes 
associated with the magnetic spins in a ferromagnet are 
the so-called spin waves. You may compare them to the 
waves that a light breeze can excite in a field of grain as we 
described in the section on symmetry breaking in Chapter 
II.6. These are again collective excitations of the ordered 
spin system with a wavelength that is long compared to 
the distances between the spins and because they have a 
long wavelength they are low energy excitations indeed. If 
we quantize these waves we get particle-like excitations or
quasi-particles called magnons. 
The Ising model


Let us take some time to discuss an absolutely iconic model 
that instantly comes to the mind of any physicist when you 
mention the word phase transition. It is called the Ising
model, cherished for its simplicity and its depth, which was
introduced by Wilhelm Lenz in 1920. He suggested it as a
problem to his student Ernst Ising, who then solved the one-dimensional
version of it and found that it had no phase
transition. Moreover, he erroneously concluded that there
would be no phase transition in any dimension. How ironic
that Ising’s ‘fame’ in physics is based on drawing a wrong
conclusion from an elementary calculation. It is precisely
the two-dimensional version we are going to discuss, which
has for a long time been the canonical model for a (second-order)
phase transition. It was solved exactly by Lars Onsager
in 1944, who reportedly at a conference just wrote
down the exact answers on a blackboard without further
explanation, leaving the learned audience flabbergasted, 
and with a nice problem to work on! The problem of figur- 
ing out how he did it. It is one of those models to which 
a tremendous amount of work has been devoted. It has 
popped up in all subfields of physics and beyond. 


As mentioned before, we distinguish the ordered ferromag- 
netic phase where all spins are aligned, and the non-mag- 
netic phase, where the spins point in random directions. 
Here the order does not concern the spatial positioning but 
the orientation of the spins. As we pointed out, in the or- 
dered phase the magnetization is some non-zero constant 
while in the disordered phase it is equal to zero. To be 
precise there is a different ordered phase which is called 
anti-ferromagnetic, where the spins at neighboring sites 
are anti-aligned. 


The Ising Hamiltonian. The classical Ising model has
an infinite array of spins that can only point up or down.
A two-dimensional Ising model configuration is depicted in
Figure III.2.12. The spins $\sigma_i = \pm 1$ only interact with their



Figure III.2.12: Ising model. A two-dimensional Ising system
of spins that can only point up or down. Here the system is in 
a disordered state, where the spins are randomly pointing up or 
down. If you think of these as nuclear spins you see that the 
spins are neatly ordered spatially on a cubic crystal, but that 
the spin orientations are disordered. So order and disorder can 
peacefully coexist if they refer to different degrees of freedom. 


nearest neighbors, and the energy is the sum of the
contributions of all pairs of neighbors $\langle ij \rangle$,


$$H(\sigma) = -\sum_{\langle ij \rangle} J_{ij}\, \sigma_i \sigma_j ,$$


where $J_{ij}$ is the interaction parameter. If $J_{ij} = 0$ there
is no interaction, whereas if the coupling is constant and
positive, $J_{ij} = J > 0$, then we have a ferromagnetic
system, and if the constant J is negative we have the
anti-ferromagnetic case. If the couplings $J_{ij}$ are chosen
randomly, then we speak of a spin glass. For simplicity we
have left out a term for the coupling of the spins to an
external magnetic field. Let us consider the ferromagnetic
case. If a pair of neighbors has the same spin, the
contribution to the energy is minimal, whereas if the spins
are opposite the contribution is maximal. The total energy
equals the sum of all pair contributions. For the
ferromagnetic case, the minimal energy configuration is therefore
the one where all spins are the same, either all up or all
down.
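
To make the energy expression concrete, here is a minimal sketch in Python. It assumes a uniform coupling J, a finite square lattice with periodic boundaries, and the NumPy library; the function name `ising_energy` is our own choice for this illustration and does not come from the text.

```python
import numpy as np

def ising_energy(spins, J=1.0):
    """Energy H = -J * (sum over nearest-neighbor pairs of s_i * s_j).

    `spins` is a 2D array of +1/-1 values on a square lattice with
    periodic boundaries. Each pair is counted once, by pairing every
    site with its right and its lower neighbor.
    """
    right = np.roll(spins, -1, axis=1)
    down = np.roll(spins, -1, axis=0)
    return -J * float(np.sum(spins * right + spins * down))

L = 4
i, j = np.indices((L, L))
all_up = np.ones((L, L), dtype=int)
checkerboard = 1 - 2 * ((i + j) % 2)   # alternating +1 / -1

print(ising_energy(all_up))        # lowest energy for J > 0
print(ising_energy(checkerboard))  # highest energy for J > 0
```

With L = 4 there are 2 x 16 = 32 neighbor pairs, so the aligned state has energy -32 J and the alternating state +32 J. Changing the sign of J swaps the roles of the two states.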


The Ising partition sum. The probability for a configuration
$\sigma$ to occur is given by the Boltzmann factor we
introduced in the section on Statistical Physics in Chapter I.1:


$$P_\beta(\sigma) = \frac{e^{-\beta H(\sigma)}}{Z_\beta} ,$$


where the normalization factor $Z_\beta$ is the partition sum:


$$Z_\beta = \sum_{\sigma} e^{-\beta H(\sigma)} .$$


Having the probability distribution of configurations, we can
define averages, or expectation values. The free energy is
defined as $F = -\beta^{-1} \log Z_\beta$, and the thermal equilibrium
states correspond to the minima of the free energy.
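
For a handful of spins the partition sum can simply be evaluated by brute force. The sketch below does this for a closed ring of n spins with uniform coupling J (our choice of a small test system, not one used in the text) and compares the result with the known closed form for the one-dimensional ring, $Z = (2\cosh\beta J)^n + (2\sinh\beta J)^n$. All function names are illustrative.

```python
import itertools
import math

def ring_energy(config, J=1.0):
    """H = -J * sum_i s_i * s_{i+1} on a closed ring of spins."""
    n = len(config)
    return -J * sum(config[i] * config[(i + 1) % n] for i in range(n))

def partition_sum(n, beta, J=1.0):
    """Z = sum over all 2^n configurations of exp(-beta * H)."""
    return sum(math.exp(-beta * ring_energy(c, J))
               for c in itertools.product((-1, 1), repeat=n))

def boltzmann_probability(config, beta, J=1.0):
    """P(config) = exp(-beta * H(config)) / Z."""
    Z = partition_sum(len(config), beta, J)
    return math.exp(-beta * ring_energy(config, J)) / Z

n, beta = 6, 0.7
Z = partition_sum(n, beta)
Z_exact = (2 * math.cosh(beta)) ** n + (2 * math.sinh(beta)) ** n
F = -math.log(Z) / beta          # free energy F = -(1/beta) log Z

print(Z, Z_exact, F)
```

The number of terms doubles with every added spin, which is why exact enumeration is only feasible for very small systems and why exact solutions such as Onsager's are so valuable.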


Ising magnetization. To obtain the magnetization we first
average the spin over all N sites in a given configuration:
$M_\sigma = \sum_i \sigma_i / N$. The thermal average is then given by


$$M = \langle M_\sigma \rangle_\beta = \sum_{\sigma} M_\sigma\, P_\beta(\sigma) .$$
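
As a small numerical illustration, the thermal average can be computed by summing over all configurations of a tiny lattice. The sketch assumes a 3 x 3 lattice with periodic boundaries and J = 1 (our choices). It also shows a subtlety of finite systems: without an external field every configuration and its mirror image have the same weight, so the average of M itself vanishes at any temperature, and one looks at the average of |M| instead to see the ordering.

```python
import itertools
import numpy as np

def thermal_averages(L, beta, J=1.0):
    """Return (<M>, <|M|>) for an L x L periodic lattice by exact enumeration."""
    Z = m_avg = abs_m_avg = 0.0
    for bits in itertools.product((-1, 1), repeat=L * L):
        s = np.array(bits).reshape(L, L)
        energy = -J * np.sum(s * np.roll(s, -1, axis=0)
                             + s * np.roll(s, -1, axis=1))
        weight = np.exp(-beta * energy)
        m = s.mean()
        Z += weight
        m_avg += weight * m
        abs_m_avg += weight * abs(m)
    return m_avg / Z, abs_m_avg / Z

m_hot, abs_m_hot = thermal_averages(3, beta=0.2)    # high temperature
m_cold, abs_m_cold = thermal_averages(3, beta=1.0)  # low temperature
print(m_hot, abs_m_hot)
print(m_cold, abs_m_cold)
```

At low temperature the average of |M| approaches 1, while at high temperature it is much smaller. A true spontaneous magnetization only appears in the limit of an infinitely large system.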


In Figure III.2.13 we have depicted three configurations,
representing the ordered and disordered phases, with a
critical configuration in between.

Order. In the ordered, low temperature phase the domains
are macroscopic (the lowest energy configuration is just a
single domain with all spins up or all spins down). In the
ordered phase the magnetization would be $M \neq 0$.

Disorder. On the right we see a configuration corresponding
to the high temperature disordered phase, where there
are basically no domains. The individual spins are just
randomly pointing up or down, and consequently the
magnetization would equal zero.

Critical. In between is the critical case where the
temperature equals the critical temperature $T_c$, where there are
domains of all possible sizes. In fact this critical case is
special in the sense that it is scale invariant, meaning that

[Figure panels, left to right: Ordered (macroscopic domains), Critical (domains on all scales), Disordered (no domains).]


Figure III.2.13: Magnetic order and disorder. We see the states 
of an Ising model without external magnetic field. At low temper- 
atures the state is ordered, and spins are aligned over macro- 
scopic distances, while at high temperatures the state is disor- 
dered and there are no domains, just individual spins randomly 
pointing up or down. In between there is a critical point, where 
there are domains of all sizes. The critical Ising model is scale 
invariant. 


if you enlarge the picture and cut out a piece of the original 
size, it would not be possible to distinguish it in a statistical 
sense from the original one. It is self-similar in a statistical 
sense. 
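
Configurations like the ones in the figure are usually generated on a computer. The sketch below uses the Metropolis algorithm, a standard Monte Carlo method that the text itself does not discuss; the lattice size, the number of sweeps and the function names are our own choices. The value of the critical point used for orientation, $\beta_c J = \tfrac{1}{2}\ln(1+\sqrt{2})$, is the exact result for the square lattice.

```python
import math
import random

def metropolis_sweeps(spins, beta, sweeps, rng, J=1.0):
    """Update a square lattice of +1/-1 spins in place (periodic boundaries)."""
    L = len(spins)
    for _ in range(sweeps * L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        neighbors = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                     + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2.0 * J * spins[i][j] * neighbors   # energy change if we flip
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            spins[i][j] = -spins[i][j]
    return spins

def magnetization(spins):
    L = len(spins)
    return sum(map(sum, spins)) / (L * L)

BETA_C = 0.5 * math.log(1 + math.sqrt(2))   # about 0.4407 for J = 1

rng = random.Random(0)
L = 16
cold = metropolis_sweeps([[1] * L for _ in range(L)], beta=1.0, sweeps=50, rng=rng)
hot = metropolis_sweeps([[1] * L for _ in range(L)], beta=0.1, sweeps=200, rng=rng)
print(BETA_C, magnetization(cold), magnetization(hot))
```

Both runs start from the fully aligned state. Well below the critical temperature (beta = 1.0) the order survives, while well above it (beta = 0.1) the magnetization quickly decays to a value near zero.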


Mean field theory. One can make an illuminating
approximation of the model as a mean field theory. One
approximates the spins by the local magnetization field
M(x). Clearly this approximation will break down at small
distances. It is possible to write an effective free energy
F(M,T) in terms of this field M(x); this is known as the
Landau theory. Because of the symmetries of the model it
will only contain even powers of the field, and to lowest
order it will look like:

$$F(M, T) = \mu\, M(x)^2 + \lambda\, M(x)^4 , \qquad \text{(III.2.1)}$$


where the parameter $\lambda > 0$ (so that the free energy is bounded from below)

[Figure: the free energy F(M,T) plotted against the magnetization M = <M(x)>; the disordered minimum lies at M = 0.]


Figure III.2.14: Second-order transition. We have plotted the
free energy F as a function of the order parameter M
(magnetization) for three values of the temperature. At $T > T_c$ the
symmetric minimum is at M = 0 (no magnetization). At $T < T_c$
the minimum is at $M \neq 0$ (spontaneous magnetization), and the
system will ‘choose’ the red or the blue minimum. This is an
example of spontaneous symmetry breaking.


and the other parameter $\mu$ has a temperature dependence
which near the critical point is given by $\mu = \mu_0 (T - T_c)$,
with $\mu_0 > 0$.
In Figure III.2.14 we have plotted the free energy F(M,T)
of the system as a function of the average magnetization
and the temperature. We see from the figure that the
minimum of the free energy for $T > T_c$ yields the value M = 0,
and for $T < T_c$ we see that the minimum of the free energy
corresponds to a non-zero value for M. The latter is the
situation where the symmetry of F is spontaneously broken,
in the sense that the system has to choose one of the
two degenerate ground states, with all spins up or all spins
down. For $T = T_c$ the system is in the critical state, where
the free energy curve flattens out ($\mu = 0$). The vanishing of
the quadratic curvature term means that the spin wave
excitations have effectively a zero mass (they are ‘gapless’).
And this is what gives rise to the power law behavior of the
correlation functions as we will discuss next.
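
The minima of the Landau free energy follow from setting its derivative with respect to M to zero, $2\mu M + 4\lambda M^3 = 0$, which gives M = 0 or $M = \pm\sqrt{-\mu/(2\lambda)}$. A minimal Python sketch, for a uniform field M and with illustrative parameter values $T_c = \mu_0 = \lambda = 1$ that are our own assumptions:

```python
import math

def landau_free_energy(M, T, Tc=1.0, mu0=1.0, lam=1.0):
    """F(M, T) = mu * M^2 + lam * M^4 with mu = mu0 * (T - Tc)."""
    mu = mu0 * (T - Tc)
    return mu * M ** 2 + lam * M ** 4

def equilibrium_magnetization(T, Tc=1.0, mu0=1.0, lam=1.0):
    """Non-negative minimum of F; the mirror solution is -M."""
    mu = mu0 * (T - Tc)
    if mu >= 0:
        return 0.0                       # single minimum at M = 0
    return math.sqrt(-mu / (2.0 * lam))  # two degenerate minima at +M and -M

for T in (1.5, 1.0, 0.75, 0.5):
    M = equilibrium_magnetization(T)
    print(T, M, landau_free_energy(M, T))
```

Below $T_c$ the magnetization grows as $(T_c - T)^{1/2}$ in this approximation. This square-root behavior is a property of mean field theory; the exact two-dimensional Ising model has a different exponent.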


Figure III.2.15: Ising model phase diagram. The vertical axis 
is in fact the phase diagram of the Ising model (without external 
field). It has only one control parameter which is the tempera- 
ture. We have plotted the average spontaneous magnetization 
as a function of temperature. If we lower the temperature the 
minima of the free energy in the previous figure trace out the 
blue and red curves giving M for T < Tc. 


In Figure III.2.15 we have summarized the results. Along 
the vertical axis we have a one-dimensional phase dia- 
gram with temperature as the only control parameter. For 
low temperature the phase is ordered, and above the crit- 
ical temperature it is disordered. In the same graph we 
have plotted the order parameter, which is the magnetization
M, along the horizontal axis. The magnetization tends
to M = ±1 as the temperature goes to absolute zero. We
see that the order parameter as a function of temperature 
changes continuously in this case, which means that we 
are dealing with a second-order phase transition. 


Correlation functions. A meaningful probe of order, and
in particular of critical behavior, is provided by the spatial
correlation functions at large distances. For the Ising model,
one calculates the thermal average of the product of two spins
$\sigma_i$ and $\sigma_j$, but now as a function of their separation $|i - j|$. The




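[Figure: the correlation function $\langle \sigma_i \sigma_j \rangle_\beta$ versus separation, with three curves labeled ordered (constant), critical, and disordered.]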


Figure III.2.16: Correlation functions. The typical behavior of
the spin-spin correlation function $\langle \sigma_i \sigma_j \rangle_\beta$ in the three regimes
of the Ising model.


expression is as follows: 


$$f(i - j) = \langle \sigma_i \sigma_j \rangle_\beta . \qquad \text{(III.2.2)}$$


It is simplest to first consider the case at low temperature
where there is long range order. This is reflected
in the correlation function tending to a non-zero constant
at large separation. On the other hand, if the system is
disordered one expects the correlations to be short range,
and indeed the correlation function can be calculated to
decay exponentially over a characteristic length called the
correlation length $\xi$. We have summarized the distinct
functional behavior of the correlation functions in the three
regimes in Figure III.2.16.
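
The exponential decay in a disordered state can be checked explicitly in the one-dimensional chain, which has no phase transition and is therefore disordered at any non-zero temperature. For an open chain without external field the known exact result is $\langle \sigma_i \sigma_j \rangle = (\tanh \beta J)^{|i-j|}$, so that the correlation length is $\xi = -1/\ln\tanh\beta J$ in units of the lattice spacing. The sketch below verifies this by brute-force summation; the chain length and the value of beta are arbitrary choices of ours.

```python
import itertools
import math

def chain_correlation(n, i, j, beta, J=1.0):
    """<s_i s_j> for an open chain of n spins, summing over all 2^n states."""
    Z = 0.0
    numerator = 0.0
    for c in itertools.product((-1, 1), repeat=n):
        energy = -J * sum(c[k] * c[k + 1] for k in range(n - 1))
        weight = math.exp(-beta * energy)
        Z += weight
        numerator += weight * c[i] * c[j]
    return numerator / Z

beta = 0.8
xi = -1.0 / math.log(math.tanh(beta))   # correlation length in lattice units
for r in range(1, 5):
    print(r, chain_correlation(10, 2, 2 + r, beta), math.exp(-r / xi))
```

The two printed columns agree, and lowering the temperature (raising beta) makes the correlation length grow, although in one dimension it only diverges at zero temperature.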


Critical behavior. The behavior at the critical point, the
phase transition itself, is of great interest. It turns out
that the transitions show a high degree of universality. The
correlation functions, for example, behave as power laws,
which means that for large x they behave like $f(x) \sim x^{\kappa}$.
Such functions are characterized by a power $\kappa$, which is
called a critical exponent. These exponents express the
characteristic quantitative behavior of correlation functions
in the critical state, between the ordered and disordered
phase. In fact, as we approach the critical point from the
disordered side one finds that the correlation length $\xi(T)$
diverges, so $\lim_{T \to T_c} \xi(T) = \infty$. This is precisely why the
exponential decay law in the disordered phase changes to
a power law at the critical point.


Universality. It turns out that different types of systems
have identical critical behavior, meaning that they have the
same set of critical exponents at the critical point. These
exponents do not depend on the microscopic details of
the model but rather on the number of dimensions and
the symmetries of the system. The fundamental symmetry
underlying second-order phase transitions is scale and
conformal invariance, which can then be extended in
various ways to obtain the different universal behaviours. So
the critical behaviour of the two-dimensional Ising model can,
for example, be described by a free massless (Majorana)
fermion field, which means that the spin and energy
correlation functions of the two models show exactly the same
critical exponents. So in this field of research, too,
symmetry arguments can greatly advance our understanding
of observed phenomena. The critical exponents
label the representations of the group of certain conformal
symmetries in two dimensions.


Anti-ferromagnetism. Now in magnetism there could be 
another type of order referred to as anti-ferromagnetism, 
where the neighboring spins tend to point in opposite di- 
rections. This corresponds to choosing the coupling pa- 
rameter J in the energy expression to equal J = —1. The 
ordered, low temperature, lowest energy configuration now 
corresponds to a red/blue checkerboard configuration. And 
the magnetization as defined above would also give zero 
for this ordered phase. This just illustrates the fact that one 
has to have some clue or make an educated guess, about 
what the state looks like before one can come up with a 
sensible type of order parameter. Here we can repair

Figure III.2.17: The loop representation of the Ising model. We
illustrate an equivalent or ‘dual’ representation of the Ising model 
where the states are represented as connected oriented paths 
along the links of the lattice. For any possible pair of neighbors 
there is a unique prescription. If the paths cross at some vertex 
there are always two arrows pointing towards and two arrows 
pointing away from the vertex. 


the definition of the magnetization quite simply by adding
an extra minus sign on all odd sites, for example. And,
as we will see, there are ordered phases in the quantum
regime where no local order parameter can be
defined.
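
The repaired order parameter is commonly called the staggered magnetization (a name the text does not use). A minimal sketch, assuming a square lattice stored as a NumPy array and calling a site 'odd' when the sum of its row and column index is odd:

```python
import numpy as np

def magnetization(spins):
    """Ordinary magnetization: the average spin."""
    return float(spins.mean())

def staggered_magnetization(spins):
    """Average spin after flipping the sign on all odd sites."""
    i, j = np.indices(spins.shape)
    signs = 1 - 2 * ((i + j) % 2)    # +1 on even sites, -1 on odd sites
    return float((signs * spins).mean())

L = 6
i, j = np.indices((L, L))
checkerboard = 1 - 2 * ((i + j) % 2)   # perfect anti-ferromagnetic order
aligned = np.ones((L, L), dtype=int)   # perfect ferromagnetic order

print(magnetization(checkerboard), staggered_magnetization(checkerboard))
print(magnetization(aligned), staggered_magnetization(aligned))
```

Each order parameter detects exactly one of the two kinds of order and is blind to the other, which illustrates why one needs an idea of what the ordered state looks like before choosing an order parameter.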


Domain walls and defects. If we think again of the energy
associated with a neighbouring pair, we have (after a shift
and rescaling of the energy) E = 0 if the spins point in the
same direction and E = 1 if they are different. Now with
this we can construct a dual representation
of the Ising model, in terms of oriented contours along
the edges of the (dual) lattice. We have depicted this
correspondence in Figure III.2.17. For any pair of neighbors
we draw an arrow along the edge they have in common 
if the spins are opposite, or no arrow if the spins are the 
same. If you now look at a large configuration, then the 
spin configuration uniquely corresponds to a configuration 


of oriented lines. There is one subtlety that is clear from 
the last picture in the figure, if two lines cross, then you 
always have two arrows pointing in and two out, and this in 
turn means that there are two options for how to connect 
the lines at the crossing. If we have a blue domain inside 
a red domain, that would yield a closed boundary oriented 
anti-clockwise, and if we exchange the colors, the orienta- 
tion would flip to clockwise. This representation in terms 
of these boundary contours or domain walls immediately 
makes manifest where the energy is located. The walls 
cost energy (because they coincide with a pair of differing 
neighbors), and the total energy equals the total length of 
the domain walls. In the ferromagnetic ground state there 
are no walls, and therefore a domain wall is called a
defect. It is a topological defect: away from the boundaries
of the sample the walls form closed loops which cannot
break. The loops can grow or shrink, they can join or break
up, they can disappear or be created, but a wall cannot
have an endpoint in the sample. So you can also think
of the Ising model as a ‘gas of loops’, with the additional
property that the loops don’t intersect. You may check this
by looking at any would-be intersection of the walls, and
note that the two ingoing arrows can be connected to the
two outgoing arrows only in two ways. Drawing these, one
finds that they do not cross indeed. The loops avoid
themselves and others.
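
The statement that the energy equals the total wall length can be checked by counting the pairs of unlike neighbors directly. The sketch assumes a finite lattice with open boundaries stored as a NumPy array, and it ignores the orientation of the walls; the function name is our own.

```python
import numpy as np

def wall_length(spins):
    """Number of nearest-neighbor pairs with opposite spins (open boundaries).

    With E = 1 per unlike pair and E = 0 per like pair, this number is
    also the total energy of the configuration.
    """
    horizontal = np.sum(spins[:, :-1] != spins[:, 1:])
    vertical = np.sum(spins[:-1, :] != spins[1:, :])
    return int(horizontal + vertical)

spins = np.ones((6, 6), dtype=int)
spins[2:4, 2:4] = -1          # a 2 x 2 island of opposite spins
print(wall_length(spins))     # the perimeter of the island
```

The island is surrounded by a closed wall of eight segments, its perimeter. In terms of the original Hamiltonian with uniform coupling J, a configuration with W wall segments on a lattice with B bonds has energy -J (B - 2W), so the two ways of counting differ only by a shift and a rescaling.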


A dual representation. The two dual representations, 
one by spin and the other by loop configurations, provide 
two complementary perspectives on order versus disor- 
der. Starting in the ferromagnetic phase from zero tem- 
perature, there are no defects, and it is by raising the tem- 
perature that the loops are created, and by the time we 
are in the disordered state, the loops have ‘condensed’;
there are defects everywhere. A maximal energy state is 
one where there is a defect on every link which happens to 
correspond to a perfectly anti-ferromagnetic state. And in- 
deed changing the sign of the neighbor-coupling J exactly 
exchanges the highest and lowest energy states of which 
there are two each. 

Figure III.2.18: US voting patterns. An Ising model
representation of voting patterns of the 2012 and 2016 elections. In the
bottom figure you see the shift. Indeed the swing states are on
the boundary; the shift involves moving the 2012 boundary. If
you create a new island you are considered a defect and it costs
a lot of energy: there is a high threshold. Building domain walls
is also costly but not as expensive as creating a new domain.


These considerations illustrate a quite general principle,
that defining a certain type of order in a system usually
also implies the existence of certain types of defects, both
topological and non-topological. This is not only true for
spin systems but for most forms of order. As will be
discussed in the next section, crystals for example have all
kinds of defects, of which the dislocations and disclinations
are the best known. These defects have their own
dynamics. For example, if we prepare the spin system starting
from high temperatures by quenching it, meaning cooling it
fast, then the loops will not have enough time to annihilate
and the defects get frozen in. If, on the contrary, we cool
it slowly, then we may end up with a perfect ground state,
as the defects had enough time to pair up and annihilate
each other.


Swing states 


G: Hey Orange! I really like that stuff you are talking
about.

O: Thank you Green. It took me quite some effort
to master this subject, so I am glad to hear you like
it.

G: You know, Orange. I think this stuff may have
great applications.

O: But Green, this is pure science just for the sake
of ...

G: All that blue and red, that order and disorder,
those arrows up and down. It really did make me
think of the elections!

O: But Green, ...

G: Those walls, you know. And how hard it is to
create blue bubbles in the red domains.

O: But Green, ...

G: You see, if you take the voting patterns of 2012
and you take those of 2016, and you look at what
happened.

O: But Green, ...

G: Yes, Orange, yes! Look at that, the swing states
are right there bordering on the walls. That’s exactly
where all their campaign money and energy went,
and yes, that’s where they got the walls moving. Chr
chr.

O: Green! Stop it.

G: And no red bubbles in the blue, and no blue
bubbles in the red. Just like you said.

O: That’s no science, Green!

G: Hey those swing states are just defects, and
nothing happens elsewhere.

O: Stop it!

G: I wouldn’t call that a landslide! It’s all in the
margins, Orange. In spite of all excitement and heated
discussions, we are dealing with an Ising system at
low temperature, with some domain walls frozen in.
Don’t you think that is a comforting thought?

O: Oh, Green, I wish I had never told you.

G: The Ising model of voting! Chr chr. Maybe we
should start working on that phase transition,
Orange! I mean, what would it be like to live in an
anti-ferromagnetic country? They call it disorder,
but didn’t you just say that it is just a different type
of order? You know, the colors mix well, and I didn’t
see a glass ceiling either. There are so many walls,
that it is just like having none!

O: Oh no!

G: Oh yes. I think we should start working on the 2020
and 2024 elections, including terms for fraud and
outcome denial! Chr chr.


It directly follows from simple energy considerations that it 
costs more to create a red site in the middle of a blue do- 
main (four units of energy), while moving a red boundary, 
which means changing a blue to red site at a boundary 
always costs less. From this local energy perspective it 
is also clear that domain walls will have the tendency to 
straighten out. 
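
These local energy costs are easy to tabulate. The sketch below counts, for an interior site, how many wall segments are added when its spin is flipped, using one unit of energy per pair of unlike neighbors. The 6 x 6 lattice with one straight wall is an example of our own making.

```python
import numpy as np

def flip_cost(spins, i, j):
    """Change in the number of unlike neighbor pairs if site (i, j) flips.

    Assumes (i, j) is an interior site with four neighbors. Pairs that
    were alike become walls; existing walls around the site disappear.
    """
    s = spins[i, j]
    neighbors = [spins[i - 1, j], spins[i + 1, j],
                 spins[i, j - 1], spins[i, j + 1]]
    like = sum(1 for n in neighbors if n == s)
    unlike = 4 - like
    return like - unlike

# Left half 'blue' (+1), right half 'red' (-1): one straight vertical wall.
spins = np.ones((6, 6), dtype=int)
spins[:, 3:] = -1

bulk = flip_cost(spins, 2, 1)      # site deep inside the blue domain
at_wall = flip_cost(spins, 2, 2)   # blue site touching the straight wall

spins[2, 2] = -1                   # the wall now has a bump
at_kink = flip_cost(spins, 3, 2)   # blue site next to the bump

print(bulk, at_wall, at_kink)
```

Creating an isolated island costs four units, putting a bump on a straight wall costs two, and moving an existing kink along the wall costs nothing. Since every bump adds wall length, straight walls are the cheapest.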


Defect condensation and dual order. The state we have 
described as disordered, where the spins are randomly 
distributed, can be considered from the dual point of view 
as a state where there are defects all over the place. If 
we were to define a dual order parameter measuring the 
average number density of wall segments or links on the 
dual lattice, it would be non-zero. In other words, it is a 
kind of dually ordered phase where the defects have con- 
densed. 


Crystal lattices 


Symmetry reigns. At low temperatures or high pressure 
atoms (or ions) tend to settle down in periodic arrays which 
correspond to a crystal lattice. A characteristic of such a 
lattice is that it is periodic, and there is a certain basic ge- 
ometric pattern — called a unit cell — that repeats itself over 
and over again. So if you move the (infinite) lattice over 
a certain distance in certain directions it looks exactly the 
same, and the same is true if one rotates around certain
axes by particular angles or reflects the lattice in certain
planes. The lattice can be characterized by the set of
symmetry operations that leave the lattice invariant. These
operations form intricate infinite discrete groups, consist- 
ing of discrete translations and rotations. 


Wallpaper groups. For the five basic space-filling lattices in
two dimensions the corresponding space groups have all
been constructed. They form the so-called wallpaper groups,
and there are a total of seventeen of them.


The Bravais lattices. The space-filling crystal lattices have 
been classified by the nineteenth century French mathe- 
matician Auguste Bravais. In two dimensions there are 
five different lattices. In three dimensions there are seven 
basic lattices to which special points may be added, mak- 
ing a total of 14 Bravais lattices. Not surprisingly there 
is an awesome jargon that comes with them in order to 
distinguish them, involving terms like cubic-face-centered, 
orthorhombic, triclinic, rhombohedral and so on. In
particular cubic-face-centered sounds to me like a fancy AI
surveillance algorithm!


For the 14 space-filling, three-dimensional lattices, the space 
groups have been fully classified and everything is known 
about all 230 of them. This means that also the point 
groups preserving the unit cell in three dimensions are 
known and there are 32 of them. 

Figure III.2.19: Symmetries of octahedron. The embedded
octahedron has the same symmetry group as the cube. 


X-ray diffraction and more. Crystal lattices can be studied
experimentally with short wavelength photons (X-rays).
The X-rays scatter from the electrons of the atoms on the
lattice sites and the scattered waves will interfere with one
another. So whether we get reflection or diffraction depends
on whether the interference of the many scattered waves is
constructive or destructive. The crystal has planes of atoms in
various directions and the photons may be diffracted or
reflected depending on whether their momentum satisfies
certain conditions which are determined by the specific
geometrical properties of the lattice. From the reflection and
diffraction patterns one can then reconstruct the geometry
of the lattice.
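
For reflection from a family of parallel lattice planes, the condition for constructive interference is usually stated as Bragg's law, $n\lambda = 2d\sin\theta$, with d the spacing between the planes, $\theta$ the angle between the beam and the planes, and n a positive integer. The sketch below lists the allowed angles; the numerical values for the spacing and the wavelength are merely illustrative assumptions.

```python
import math

def bragg_angles(d_spacing, wavelength):
    """Angles theta in degrees with n * wavelength = 2 * d * sin(theta)."""
    angles = []
    n = 1
    while n * wavelength <= 2 * d_spacing:
        angles.append(math.degrees(math.asin(n * wavelength / (2 * d_spacing))))
        n += 1
    return angles

# Illustrative values in angstrom (assumed): plane spacing and X-ray wavelength.
d, lam = 2.82, 1.54
for n, theta in enumerate(bragg_angles(d, lam), start=1):
    print(n, round(theta, 2))
```

Only a few discrete angles give a strong reflection. Measuring them for a known wavelength yields the plane spacings, and from many such spacings the geometry of the lattice can be reconstructed.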


This widely applied technique of studying molecular order,
whether it concerns lattices or complicated molecular structures
like DNA³, was invented by the British physicists William


³ There is the (in)famous story that Francis Crick and James D. Wat-
son discovered the structure of DNA in 1953 after Maurice Wilkins 
had shown them a diffraction pattern measured by Rosalind Franklin 
at King’s College London. It held the clue to the spatial structure of the 
double helix. 


Henry Bragg and his son William Lawrence, who shared
the Nobel prize for Physics in 1915. The application of
the technique to the complicated molecules of life was
pioneered by Max Perutz, an Austrian refugee, who got a
position at the Cavendish laboratory in Cambridge with
the Braggs. Nowadays we can probe the surface of solids
on atomic scales by advanced microscopes, the scanning
tunneling microscope (STM) or the atomic force microscope
(AFM). But the 3-D imaging is still of the diffractive type.
These probing techniques are, not surprisingly, based
on quantum principles themselves.


There is the remarkable fact that if you want to probe na- 
ture at some scale then nature often also provides you with 
the tools which are operative at the same scale, that allow 
you to build suitable probing devices. It is a matter of giving 
and taking. This is true for atoms with visible light, for nu- 
clear structure using nuclei (alpha particles), and is true for 
genetic manipulation using all sorts of enzymes etc. 


Kitchen salt or the cube. Let us now look in more
detail at some three-dimensional lattices. A well-known
example in three dimensions is the kitchen salt or sodium
chloride (NaCl) crystal, which is a simple cubic lattice
with the sodium and chloride ions occupying alternating
sites (see Figure III.2.20(c)). The point group of the
cubic lattice, which is the symmetry group of the cube,
is surprisingly rich and consists of 24 elements. As
indicated in Figure III.2.20(d), it has four threefold axes
(rotations around main diagonals), three fourfold axes (around
lines through centers of opposite faces), and six twofold
axes (through centers of opposite edges). This group is
denoted by O and called the octahedral group, because it
is also the symmetry group of the octahedron obtained by
drawing the planes through the face centers of the cube,
as one may see from Figure III.2.19. Each fourfold axis
contributes three non-trivial rotations, each threefold axis
two and each twofold axis one, so including the identity
element we verify that the group has indeed
1 + 3×3 + 4×2 + 6×1 = 24 elements. The
transformations we discussed so far are all rotations, but there
is one more transformation that leaves the cube invariant,

(a) The Cubic Space Division by M. Escher. It is invariant
under translations by the lattice constant a along
the x, y and z axes. (© 2023 The M.C. Escher Company.)

(b) Kitchen salt crystals of about 10 micrometers. Image taken with
environmental scanning electron microscope (ESEM) at 950° C.

(c) The crystal of kitchen salt or sodium chloride (NaCl). It is a
simple cubic lattice with alternating sodium (purple) and chloride (green)
ions. (Source: MIF Univ. of Calgary.)

(d) The symmetries of a cube. It has three fourfold axes (blue), four
threefold axes (red) and six twofold axes (green). The set of all
rotations that leave the cube invariant is the octahedral group O; it has
24 elements.


Figure III.2.20: The symmetries of the cube. 

(a) The facets of a diamond are designed to maximize its reflections.

(b) A cubic face-centered (fcc) lattice cell.

(c) A second fcc lattice superimposed at the point $(\tfrac14, \tfrac14, \tfrac14)$ in blue.

(d) The resulting diamond lattice as a stacking of tetrahedra.

(e) The lattice as a stacking of planes with tetrahedra (with center).

(f) The lattice stacking of planar triangular lattices.


Figure III.2.21: The diamond lattice. The intricate diamond lattice and some ways to look at it which display different aspects of its 
symmetry. 

namely inverting all coordinates, which amounts to mirroring
every point of the cube in the origin. This is called the
inversion or parity operation P. If we add this transformation,
we get a group denoted by $O_h$ with 48 elements. This
group is non-abelian (not all elements commute with each
other), as one can easily check.
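
These counts can indeed be checked by letting a computer list the group. For a cube centered at the origin with its edges along the coordinate axes, every symmetry permutes the three axes and possibly reverses some of them, so it is a 3 x 3 matrix with a single entry ±1 in each row and column. The sketch below (using NumPy; the function names are our own) builds all such matrices and selects the rotations as those with determinant +1.

```python
import itertools
import numpy as np

def signed_permutation_matrices():
    """All 3x3 matrices with exactly one entry +1 or -1 per row and column."""
    mats = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            m = np.zeros((3, 3), dtype=int)
            for row, (col, s) in enumerate(zip(perm, signs)):
                m[row, col] = s
            mats.append(m)
    return mats

def order(m):
    """Smallest k >= 1 such that m applied k times is the identity."""
    k, p = 1, m.copy()
    while not np.array_equal(p, np.eye(3, dtype=int)):
        p = p @ m
        k += 1
    return k

full_group = signed_permutation_matrices()   # rotations and reflections
rotations = [m for m in full_group if round(np.linalg.det(m)) == 1]

counts = {}
for m in rotations:
    counts[order(m)] = counts.get(order(m), 0) + 1

non_abelian = any(not np.array_equal(a @ b, b @ a)
                  for a in rotations for b in rotations)
print(len(full_group), len(rotations), counts, non_abelian)
```

The rotations split into the identity, nine half-turns (three about the fourfold axes and six about the twofold axes), eight rotations about the threefold axes and six quarter-turns, in agreement with the count 1 + 3×3 + 4×2 + 6×1 = 24. The inversion is minus the identity matrix; it commutes with everything and doubles the group to 48 elements.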


Diamond. Crystal lattices clearly exhibit intricate and
aesthetically pleasing features and coincidences. A nice
example is the diamond lattice, which is more involved, and we
explore different perspectives on it in Figure III.2.21. The
structure is built up of two cubic face-centered (fcc) lattices
which are shifted with respect to one another (III.2.21(b)
and III.2.21(c)). The result is a perfect three-dimensional
stacking of tetrahedra (III.2.21(d) and III.2.21(e)): the
corners are all on the first fcc lattice while the centers are
on the second fcc lattice. The lattice can therefore also
be viewed as a stacking of planes with tetrahedra. One
can go one step further and think of the whole lattice as a
stacking of pairs of strictly identical triangular lattices, one
of each fcc lattice. In Figure III.2.21(f) we show the three
top layers of subsequent pairs, which all belong to the first
fcc lattice. Projecting all the points down along the body
diagonal, which is perpendicular to the layers, one finds that
there are three inequivalent triangular lattices in the figure,
corresponding to the blue, red and green layers.
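
The tetrahedral structure can be made explicit with a few lines of code. The sketch below builds a block of the diamond lattice from two fcc lattices, the second shifted by a quarter of the body diagonal, $(\tfrac14, \tfrac14, \tfrac14)$ in units of the cubic lattice constant, and then looks up the nearest neighbors of one atom of the second lattice. The block size and the names are our own choices.

```python
import itertools
import numpy as np

# Positions of the four atoms in a conventional fcc cell (units of the cube edge).
FCC_BASIS = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
SHIFT = np.array([.25, .25, .25])   # offset of the second fcc lattice

def diamond_points(n_cells=2):
    """Atoms of both fcc lattices in a block of (2 * n_cells + 1)^3 cubic cells."""
    cells = np.array(list(itertools.product(range(-n_cells, n_cells + 1), repeat=3)))
    fcc = (cells[:, None, :] + FCC_BASIS[None, :, :]).reshape(-1, 3)
    return np.vstack([fcc, fcc + SHIFT])

points = diamond_points()
center = SHIFT                                   # an atom of the second lattice
dist = np.linalg.norm(points - center, axis=1)
nearest = points[np.isclose(dist, np.sqrt(3) / 4)]

print(len(nearest))
print(nearest)
```

The atom has exactly four nearest neighbors at distance $\sqrt{3}/4$, all belonging to the first fcc lattice, and it sits precisely at the center of the tetrahedron they span.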


The uses of symmetry. It turns out that the symmetry
group tells us a lot about the physics of the system; it not 
only characterizes the stable equilibrium or ground state, 
but also provides a natural labelling of the low energy modes 
that can propagate through the system. The symmetry 
teaches us also about properties of the spectrum of elec- 
trons. And finally the symmetry group of the lattice de- 
termines the possible lattice imperfections or defects that 
may occur. 


As we live in three-dimensional space most of us will agree 
that our analysis should stop there. The classification of 
space groups and lattices in higher dimensions is to be 


considered a mathematical pastime at best. But nature
had a surprise in store. Who would have expected that
higher-dimensional regular lattices would rear their heads
also in our three-dimensional world, in the guise of
so-called quasicrystals?


This provides another striking example of the ‘unreasonable
effectiveness of mathematics in the natural sciences’,
which refers to the title of a famous lecture by Eugene
Wigner, who got the Physics Nobel prize precisely for his
work on group theory and its many applications in quantum
theory. We will return to quasicrystals towards the end of
this section.


Crystallization and symmetry breaking


We introduced and expanded on the concept of symmetry 
breaking in Chapter II.6. It has many beautiful applications 
in condensed matter, and in particular also in the theory of 
crystallography. In this section we explore two representa- 
tive examples. 


The concept. Suppose one of the atoms in a simple cubic
lattice is of a different type, say it has a different color;
then we may ask for the transformations that leave not
only the cube invariant but also keep the colored atom in
place. For this case the answer is quite obvious from
Figure III.2.20(d). If we ask which transformations leave not
only the center but also one of the red dots in place, then
we are only left with a single threefold axis. This means
that the rotation group G = O is reduced to, or as is often
said, broken to, $H = C_3$. This reduction of the symmetry
from a group G to the so-called residual symmetry group
H, which is a subgroup of G, means that certain
degeneracies in the spectrum that occurred in the unbroken
situation will now be lifted. So one could say that breaking the
symmetry allows for less uniformity and more
differentiation.
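
The residual group can be found by brute force: list all rotations of the cube and keep those that leave the marked point in place. In the sketch below the cube is centered at the origin and the marked point is taken to be the corner (1, 1, 1), one of the points where a threefold axis pierces the cube. This setup and the names are our own illustration.

```python
import itertools
import numpy as np

def cube_rotations():
    """The 24 rotations of the cube as signed permutation matrices with det +1."""
    rots = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            m = np.zeros((3, 3), dtype=int)
            for row, (col, s) in enumerate(zip(perm, signs)):
                m[row, col] = s
            if round(np.linalg.det(m)) == 1:
                rots.append(m)
    return rots

G = cube_rotations()                       # the unbroken group O
marked_corner = np.array([1, 1, 1])

# Residual symmetry: rotations that keep the marked corner in place.
H = [g for g in G if np.array_equal(g @ marked_corner, marked_corner)]

# All positions the marked corner can be rotated to.
orbit = {tuple(g @ marked_corner) for g in G}

print(len(G), len(H), len(orbit))
```

The residual group has three elements, the rotations by 0, 120 and 240 degrees about the diagonal through the marked corner. The corner itself can be moved to all eight corners, and 24 = 3 × 8: the size of the full group equals the size of the residual group times the number of equivalent positions of the marked point.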

Symmetry breaking is therefore an invaluable tool to
analyse and interpret experimental data. In particular if we
have certain external parameters we can change, like
temperature or electric or magnetic fields, it may be that what
appeared as one state breaks up into a set of different states.
Then different states with the same energy may split up
into states with different energies, much in the way we
discussed in Chapter I.4 in relation to the Zeeman effect.
There the spherical symmetry of the atom was broken by
the direction of the external magnetic field, and the
spectral line was split because the degeneracy of the states
was lifted. Symmetry breaking may also happen
spontaneously if one lowers the temperature, as is the case with
‘spontaneous magnetization’ in a magnet as described by
the Ising model. Even without an external magnetic field
the spins may line up because of their local ferromagnetic
interactions.


You may now also understand that the formation of a lattice
itself, the process of crystallization, is an example of
spontaneous symmetry breaking. You should think of starting
with a liquid which we envisage as a continuum. If you are
at some point in the liquid it looks the same, independently
of what point you choose, and it also looks the same in all
directions. A simple fluid is therefore said to be
homogeneous and isotropic. This translates into the statement that
the symmetry of a simple liquid consists of all rotations by
any amount about any axis, and also of translations in any
direction by any amount. Clearly this group is continuous;
it is called the Euclidean group E3 of three-dimensional
rotations and translations, which we mentioned before. It is
the symmetry group of empty three-dimensional Euclidean
space. So crystallization is a process where the symmetry
gets broken from the Euclidean group to the symmetry
group of the lattice, which is a discrete subgroup of
E3.


Goldstone modes. We have in the section on symmetry
breaking of Chapter II.6 mentioned how breaking of a continuous
(global) symmetry leads to the existence of massless
modes. This is precisely what happens upon crystallization,
where the Euclidean group gets broken to the discrete
lattice group. The low energy modes correspond to
the sound modes that can propagate through the crystal. 
They are the Goldstone modes which are associated with 
the breaking of the continuous translational symmetries of 
the perfect fluid. In Figure III.2.22 we give the pictorial ac- 
count. From (a) to (b) the crystallization takes place. In 
Figure (c) we have sketched a sound mode corresponding 
to a longitudinal pressure or density wave that propagates 
through the crystal. 


Topological defects. There is an additional observable 
consequence of broken symmetry in the situation we are 
discussing. Broken symmetries manifest themselves not 
only in lifting degeneracies and the presence of particu- 
lar low energy modes, but also in the presence of defects, 
called lattice defects in the case at hand. The theory predicts
that if we break the continuous group E3 to the discrete
group of the cubic lattice, we have line defects that
we in principle can label by the elements of the symmetry
group of the lattice. In a crystal we typically distinguish
two kinds: translational defects called dislocations and rotational
defects called disclinations. We have illustrated
them for a two-dimensional lattice in the pictures (d) and
(e) of Figure III.2.22.


Dislocation. In the bottom left Figure III.2.22(d) the dark 
atom is special, since it marks the endpoint of an extra 
vertical layer that does not go all the way up. Note that far 
away from the marked atom the lattice has restored itself 
to its normal unperturbed form. The marked atom is an 
irregularity, a defect. How do you quantify the defect? In 
this case you should compare the near environment of a 
normally positioned atom with that of a defect. If you walk 
around a normal atom like the one marked in the upper 
left corner, following the blue arrows you see that it takes 
8 steps to get back. If you take 8 steps around the defect 
site, you go one step too far, and you have to move back 
by one lattice vector (marked in yellow).


(a) A gas or liquid made of simple atoms. It has no long range order and
therefore effectively a continuous translational and rotational symmetry.

(b) Upon cooling the atoms may ‘freeze’ and form a regular lattice. The
symmetry corresponds to a space group consisting of discrete translations
and rotations.

(c) A sound wave propagating horizontally. Sound is a periodic density
fluctuation in the direction of the motion (longitudinal). The atoms are
coherently moved out of their equilibrium position.

(d) The empty site is a translational defect, also called a dislocation,
because when going around it in 8 steps one’s position is shifted by one
lattice distance. As indicated, away from the defect the lattice is restored.

(e) A rotational defect (disclination) related to a rotation over an angle
of 90°. One sees the defect angle if one carries a little vector tied to the
local lattice frame around the defect. Starting on the left we obtain a
defect angle of 90°.

Figure III.2.22: Defects and broken symmetry. We show two types of lattice defects in a simple two-dimensional crystal.


This translational defect is thus labeled by a translation vector (also called
a Burgers vector), and that elementary translation corresponds
to a basic element of the discrete translation part
of the lattice group. One encounters this one step disloca- 
tion on any loop around the defect location. Therefore the 
defect is uniquely labeled by this group element. 


It is easy to imagine that in the process of crystallization 
such dislocations may form spontaneously. The number 
of dislocations one finds will depend on how fast we cool 
the system down. It is clear from the picture that the de- 
fect locally deforms the lattice and therefore will carry a 
certain amount of extra energy. The dislocation in two di- 
mensions is a point defect, and it is stable for a topological 
reason. You may be able to move it around, but you can- 
not smoothen it out locally. You can think of the dislocation 
to be connected (via the extra layer) to either the bound- 
ary of the sample or to an ‘anti-defect’, which means that 
these defects are locally stable but can annihilate with an 
anti-defect. 


Disclination. In the bottom right picture we show a discli- 
nation or rotational defect. This defect is labeled by the 
‘defect angle’ you encounter as you parallel transport a lo- 
cal lattice vector (or frame) around the defect. If in the fig- 
ure we take the local blue vector smoothly along the green 
path around the defect and return to the starting position, 
the vector has rotated over an angle of 90°, and again 
this is an element of the symmetry group of the lattice. 
This analysis reminds us of our considerations in the section
on curved spaces in Chapter I.2, where we discussed
this characteristic and called it a non-trivial holonomy. It
requires a lot of energy to make a disclination. They
may spontaneously form in small samples, and alterna- 
tively you can also imagine ‘growing’ the crystal starting 
from the impurity outward. That way the fivefold symmetry 
would be introduced ‘by hand.’ It is not a lattice in the nor- 
mal sense because the translational symmetry is broken 
right from the start of the growing process, that is the price 
for having a fivefold rotation symmetry in the plane. 


Liquid crystals 


We have alluded to the importance of the shapes of con- 
stituent particles for understanding their collective behav- 
ior. This hidden underlying geometry is one of the keys 
to the diversity that is displayed in properties of materi- 
als. In the previous section we showed that these shapes 
can often be translated into symmetries or their breaking. 
A splendid example of this are the types of order/disor- 
der that arise in soft condensed matter physics, in par- 
ticular the subject of liquid crystals and nematics. With 
the language of symmetry at hand we are able to give 
some qualitative characteristics of the materials straight- 
forwardly. The examples are quite easy to visualize and 
are used to further illuminate the rather abstract notion of 
symmetry breaking. 


Partial order. As mentioned, an ordinary lattice is an ex- 
ample where we break the continuous Euclidean group E3 
down to an infinite discrete group of translations and ro- 
tations. It is not so hard to imagine that media can have 
strange mixtures of order and disorder which are in be- 
tween a liquid and a crystal. In such cases the transla- 
tional symmetry is not broken but the rotational symmetry 
is: the system is partially ordered. These types of sys- 
tems can easily be visualized by assuming that the build- 
ing blocks have simple geometric properties, for example 
they are like tiny rods or pancakes or tetrahedra. 


Nematics and smectics. In Figure III.2.23 we illustrate various
possibilities if the constituents are rod-shaped. They
can form an ordinary liquid or a fully ordered crystal, with
both translational and rotational order. In Figure III.2.23(c),
however, they form a two- or three-dimensional structure
which has orientational order while keeping full translational
symmetry; this liquid crystal is called a nematic. The next
picture shows another realization: the rods are oriented 
along the z direction. Furthermore the rods form strict hor- 
izontal layers, but within the layers there is free motion. 




(a) A nematic liquid made of simple rod-shaped 
atoms. It has no long range order and therefore 
effectively a continuous translational and rota- 
tional symmetry. 


(b) Upon cooling the atoms may ‘freeze’ and 
form a regular lattice. Translational and ro- 
tational symmetries are broken to a discrete 
space group consisting of discrete translations 
and 180° rotations only. 


(c) A liquid crystal in which there is still com- 
plete translational symmetry, but the rotational 
symmetry is broken. Such a phase is called ne- 
matic. There is no positional order but there is 
orientational order. 


(d) This system is called a smectic. It is 
anisotropic, as it is made up of independent lay- 
ers in which the horizontal translational symme- 
try is still manifest, but in the vertical direction it 
is ordered. There is complete orientational or- 
der. 


(e) A rotational defect (a vortex) as it exists in
an ordered spin system (represented by ordinary
arrows), related to a rotation over an angle
of 360°. This is observed if one follows the direction
of the spin vector as one moves around
the defect.


(f) This is a rotational defect in a nematic of
rods. It is called a half-vortex as it corresponds
to a defect angle of 180°. This defect is not possible
in a spin system like in (e), because after going around
the defect the spin arrow would point in the
opposite direction.


Figure III.2.23: Nematics. Various types of two-dimensional order in a nematic system made up of rod-shaped molecules.
This structure is called a smectic. A third possibility (not
depicted) is called a uniaxial nematic, where the rods are 
vertically stacked in thin filaments. In the direction of the 
filaments there is the translational order of stacking, but 
there is no horizontal order across the filaments. 


Defects. In Figures III.2.23(e) and III.2.23(f) we have depicted
two rotational defects. The first one is an ordinary
point defect one may encounter in a two-dimensional spin
system or vector field, but of course it can also exist in a
nematic. The signature of the defect is that when a vector is
parallel transported along a closed loop around the defect, the
spin rotates over 360° as indicated in the figure. The last
picture shows a ‘half-vortex’, and we see that the config- 
uration is smooth in spite of the fact that the rod rotates 
only over 180° when taken around. So it is a point-like 
defect. This configuration will not form in a spin system 
because there would necessarily be a discontinuity along 
a line starting from the defect and ending at the boundary. 
Such a line would cost much energy and that suppresses 
the formation. One way to look at this is to say that the 
half-vortices are ‘confined’ in the ordered two-dimensional
spin system. Indeed, if one cools a spin or nematic liq- 
uid rapidly through the transition one usually finds many 
of the allowed point defects in the (partially) ordered sys- 
tem. 
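
The difference between the vortex and the half-vortex can be made
concrete with a small numerical sketch. It assumes an idealized defect
in which the local orientation at polar angle phi around the defect
makes an angle theta = k * phi, with k = 1 for the vortex and k = 1/2
for the half-vortex. The function names and this parametrization are
our own illustration and do not come from the figures.

```python
import math

def defect_angle(k, steps=360):
    """Total rotation (in degrees) of the local orientation after one loop
    around an idealized defect with orientation angle theta = k * phi."""
    total = 0.0
    for i in range(steps):
        phi0 = 2 * math.pi * i / steps
        phi1 = 2 * math.pi * (i + 1) / steps
        total += k * phi1 - k * phi0   # small rotation picked up on this step
    return math.degrees(total)

def is_single_valued(k, headless):
    """Does the field match itself after a full loop around the defect?
    Arrows (spins) must come back rotated by a multiple of 360 degrees,
    headless rods (nematic directors) by a multiple of 180 degrees."""
    period = 180.0 if headless else 360.0
    ratio = defect_angle(k) / period
    return abs(ratio - round(ratio)) < 1e-9

for k in (1, 0.5):
    print(k, defect_angle(k),
          "spins ok:", is_single_valued(k, headless=False),
          "rods ok:", is_single_valued(k, headless=True))
```

In this sketch the half-vortex is only consistent for headless rods,
which mirrors the statement that in a spin system it would require a
line of discontinuity.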


We have illustrated the idea of liquid crystals with a very 
simple example, but it should be clear that there is an 
unlimited arsenal of variations and alternatives that has
been very actively pursued, for example under the
name polymer physics. As we mentioned, Pierre-Gilles de
Gennes of the Collège de France made many invaluable
contributions to the early exploration and further develop- 
ment of this field of research. 


Quasicrystals 


Tilings of the plane. In Figure III.2.24 we have depicted
some tilings of the plane by simple regular polygons.* It 
works perfectly for triangles, squares and hexagons, but 
with pentagons (Figure III.2.24(c)) it doesn’t quite fit and 
one cannot tile the plane. A consequence of this is that 
in the diffraction patterns there can be no signature of a 
fivefold symmetry. In three dimensions something similar 
happens, since it is not possible to fill space by stacking 
dodecahedra which do have fivefold symmetries. The Bra- 
vais lattices we discussed before do not admit any fivefold 
axes and therefore the diffraction patterns of periodic crystals
can only have two-, three-, four-, and sixfold symmetries
and cannot have a fivefold symmetry.
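
Why only triangles, squares and hexagons work can be checked with a few
lines of arithmetic. The sketch below assumes the standard criterion for
edge-to-edge tilings by a single regular polygon: a whole number of
corners has to fit around each vertex, so the interior angle must divide
360 degrees. The function names are ours.

```python
from fractions import Fraction

def interior_angle(n):
    """Interior angle of a regular n-gon in degrees, as an exact fraction."""
    return Fraction(180 * (n - 2), n)

def tiles_plane(n):
    """True if a whole number of n-gon corners fits around a vertex,
    i.e. if the interior angle divides 360 degrees exactly."""
    return (Fraction(360) / interior_angle(n)).denominator == 1

tiling_polygons = [n for n in range(3, 13) if tiles_plane(n)]
print(tiling_polygons)

# For the pentagon three corners add up to less than 360 degrees
# and four corners to more, so a gap or an overlap is unavoidable.
print(3 * interior_angle(5), 4 * interior_angle(5))
```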


It was a big surprise therefore, when in 1982 the Israeli 
physicist Daniel Shechtman actually observed a clear diffraction
pattern that appeared to come from a perfect crystal
but nevertheless showed a manifest fivefold symmetry,
like the pattern displayed in Figure III.2.26(c). How
could that be? Could there be a nice Bragg diffraction 
pattern coming from some non-periodic structure? Yes 
indeed, it turned out that a nice but not perfect diffrac- 
tion pattern could be generated not only by a perfectly 
periodic, but also by a non-periodic structure. The sys- 
tem of Shechtman was clearly perfectly ordered, other- 
wise there would not be such a clear diffraction pattern, 
but could not be periodic, because that is incompatible with 
the fivefold symmetry. With his observations the new field 
of quasi(periodic)-crystals was born. Shechtman received 
the Nobel prize in Chemistry in 2011 for his remarkable 
discovery which caused a paradigm shift in the well-esta- 
blished field of crystallography. 


* A regular polygon has equal angles and is equilateral.


(a) A triangular tiling of the plane.

(b) A square tiling of the plane.

(c) The plane cannot be filled with pentagons.

(d) A hexagonal tiling of the plane. Adding the centers would make it a
triangular lattice like (a) again.

Figure III.2.24: Polygon tilings. Possible and impossible polygon tilings of the plane. The regular tilings have discrete translational
and rotational symmetries plus reflection symmetries in certain planes.


(a) The projection method to obtain a non-periodic tiling. A strip is
constructed by shifting the basic cell along the subspace T on which one
wants to project, and the white segment (subspace) is the intersection of
the strip with T⊥.

(b) All lattice points in the strip are projected on T, producing a
non-periodic ‘tiling’ of the red line (space).

(c) A pentagon filled with six smaller pentagons. Embedding six pentagons
into a larger one can be repeated indefinitely, to generate the Penrose
tiling P1.

(d) The Penrose tiling P1. The tiling is self-similar and basically a
fractal. Translational invariance has been given up in favor of scale
invariance.

Figure III.2.25: Non-periodic tilings. Non-periodic but scale invariant tilings of the line and of the plane.


Non-periodic tilings. An instance of a quasi-periodic structure
is a tiling of the plane by two types of rhombi, depicted
in Figure III.2.26(a). This tiling has an approximate
fivefold local symmetry. The mathematics of these non-periodic
tilings has been developed by the British mathematical
physicist Roger Penrose in the early 1970s.† They
are remarkable in that there is no translation which leaves
the tiling invariant. They can have reflection symmetry and
for example a fivefold rotation symmetry. But the Penrose
tilings have another more subtle so-called scaling symmetry,
which means that from any point in the tiling you can
blow up or shrink the tiling by a certain amount and it will
fit again. This means that such patterns are self-similar:
they repeat themselves on larger and larger scales and
are therefore a special kind of so-called fractals.


† Penrose received the Nobel prize for Physics in 2020, not for his
‘tilings’ but for ‘his discovery that black hole formation is a robust
prediction of the general theory of relativity’.


A one-dimensional Fibonacci tiling. One way to obtain
quasicrystals or quasi-periodic tilings is by projecting regular
periodic lattices from higher dimensions. We have illustrated
this in Figure III.2.25. The top two pictures illustrate
the method of going from a simple two-dimensional
square lattice to a non-periodic one-dimensional ‘lattice’.
One first defines the ‘physical’ one-dimensional space T
like the red line in the figures. In this example the line has
a slope 2/(1 + √5), which is equal to the inverse of the
Golden Mean. This slope is an irrational number which
ensures that the line will never go through a point of the lattice
and that guarantees that the sequence is not periodic. The
following step is to shift the two-dimensional unit cell along
T, and this defines the light shaded strip along T. Next
one projects all lattice points in the strip parallel to the orthogonal
subspace T⊥ on T and one gets a non-periodic
covering of the line by line segments of only two distinct
lengths, being the two different one-dimensional tile types.
The sequence of short (s) and long (l) segments forms a
so-called Fibonacci chain: sl, sll, slsll, sllslsll, ....
Each next entry of the sequence is obtained by joining
the previous two, which makes the sequence as a whole
‘self similar’. Every finite sequence is repeated an infinite
number of times, but that does not imply that the chain
is periodic. There is also an alternative way to construct
the sequence through some ‘growing’ algorithm. This is
a general method that can be used to generate any Penrose
tiling, and is referred to as the substitution or inflation
method. This is beyond the scope of this book and we will
not discuss it in any more detail.
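
Both constructions of the one-dimensional tiling are easy to try out
numerically. The following is a minimal sketch under our own
assumptions: the chain starts from the two entries quoted in the text,
the line T passes through the origin of the square lattice, and the
strip is the half-open shadow of the unit cell on the orthogonal
direction. The function names are hypothetical.

```python
import math

def fibonacci_chain(n_entries):
    """Entries of the chain quoted in the text; each new entry is
    obtained by joining the previous two."""
    chain = ["sl", "sll"]
    while len(chain) < n_entries:
        chain.append(chain[-2] + chain[-1])
    return chain[:n_entries]

def projected_tiling(n_max=50):
    """Cut-and-project sketch: keep the square-lattice points inside the
    strip swept out by the unit cell along T and project them onto T."""
    golden = (1 + math.sqrt(5)) / 2
    alpha = math.atan(1 / golden)        # T has slope 2 / (1 + sqrt(5))
    c, s = math.cos(alpha), math.sin(alpha)
    positions = []
    for m in range(-n_max, n_max + 1):
        for n in range(-n_max, n_max + 1):
            perp = -m * s + n * c        # coordinate along the orthogonal direction
            if -s <= perp < c:           # shadow of the unit cell
                positions.append(m * c + n * s)   # coordinate along T
    positions.sort()
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    # The gaps come in two lengths only: c (long) and s (short).
    return "".join("l" if g > (c + s) / 2 else "s" for g in gaps)

print(fibonacci_chain(4))
word = projected_tiling()
print(word[:34])
print(word.count("l") / word.count("s"))   # close to the Golden Mean
```

In this sketch the projected chain never contains two short segments in
a row, and the ratio of long to short segments approaches the Golden
Mean as the strip is made longer.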


The two-dimensional Penrose tiling P1. Let me now give 
you an idea how one can obtain a non-periodic tiling in 
two dimensions with a fivefold symmetry by the projection 
method. We start with a five-dimensional simple cubic lat- 
tice. This lattice evidently has a fivefold symmetry rotating 
about the diagonal of the hypercube, where the corners 
on the five coordinate axes are rotated into each other. 
This is just like the threefold axes of the three-dimensional 
cube depicted in Figure II.2.20(d). We choose the physical 
space as a plane that is orthogonal to the fivefold axis. We 
then move the hypercube over the plane to obtain a five- 
dimensional layer. All lattice points and edges in that layer 
can now be projected orthogonally on the two-dimensio- 
nal physical space, and then a tiling like the Penrose tiling 
P1 of Figure III.2.25(d) results. The figure shows that P1
needs four types of tiles to fill the plane: the ‘pentagon’, the 
‘star’, the ‘boat’ (half star) and the ‘lozenge’. The tiling has 
an approximate ‘local’ fivefold rotational symmetry. 


The projection method allows us to generate all the two- 
and three-dimensional Penrose tilings. From the figure 
one may correctly guess that the P1 tiling can also
be constructed from a concentric ‘growing’ algorithm. Not 
surprisingly the topic of quasicrystals has given rise to a 
prolific mathematical literature. 


The projection method is due to Paul Steinhardt of the 
University of Pennsylvania, while the growing algorithmic 
approach was worked out in detail by the British mathe- 
matician John Horton Conway and Roger Penrose him- 
self. 
(a) A quasi-periodic tiling of the plane with fivefold symmetry with two
types of rhombi. The sharp angles of the rhombi are 72 and 36 degrees.

(b) The (Penrose) quasi-periodic tiling (P3) of the plane with a ‘local’
fivefold symmetry. It is possible to completely cover the plane by this
arrangement with only two different types of tiles.

(c) The diffraction pattern of a quasicrystal (the Al6Mn alloy) having a
fivefold symmetry.

(d) The calculated diffraction pattern from a projected higher-dimensional
lattice, in a direction orthogonal to a fivefold axis (as in Figure (b)).

Figure III.2.26: A quasicrystal with fivefold symmetry.

Further reading. 
On condensed matter physics: 


— Introduction to Solid State Physics
Charles Kittel
Wiley (2004)


— Solid State Physics
Neil W. Ashcroft and N. David Mermin
Thomson Press (2003)


— Principles of Condensed Matter Physics
P. M. Chaikin and T. C. Lubensky
Cambridge University Press (1995)


— Modern Condensed Matter Physics
Steven M. Girvin and Kun Yang
Cambridge University Press (2019)


On quasicrystals:


— Quasicrystals: The State of the Art
D. P. DiVincenzo and P. J. Steinhardt
World Scientific (1991)


— Quasicrystals and Geometry
M. Senechal
Cambridge University Press (1995)


Chapter III.3 


The electron collective 


Bands and gaps 


Electron states in periodic potentials 


Two limits. If the nuclei are positioned on the sites of 
some regular cubic or hexagonal crystal lattice, the elec- 
trons no longer move in a spherical electric field of a single 
nucleus which would give rise to the atomic bound state 
orbits, rather the electrons experience a periodic electric 
potential due to the nuclei on the lattice. You may imagine 
some set of energy wells with a characteristic depth $-V_0$
separated by a distance a. To get an idea of what may
happen in this situation we can approach it from two sides
as indicated in Figure III.3.1.


The first approach starts on the left-hand side where we 
assume that the separation a of the nuclei on the lattice 
would be large compared to the sizes of the electron clouds 
of the individual atoms. Then the electron states stay lo- 
calized around each atom and would maintain the typical 
atomic spectrum as given on the left. For a solid of N 
atoms each level would be N-fold degenerate. Now if we 
start making the separation a smaller, then at a certain 
point the clouds of neighboring atoms would start over- 
lapping, and the electrons would start feeling each other’s 
presence due to both their charge and the exclusion principle.
This repulsion would deform the clouds and therefore
the energy levels would start to split. As a consequence
energy bands of narrowly split levels start showing
up in the spectrum as indicated on the diagram in the middle.


Figure III.3.1: Energy levels, bands and gaps. For individual
atoms (l), free electrons (r), and for a periodic lattice of ions (m).


We could also approach the problem from the right-hand 
side where we start with $V_0$ small. Then we would just
have the spectrum of free electrons moving through space,
and these can have any energy. In other words the spectrum
is continuous as indicated in the diagram on the right.
Now letting the size of the potential barrier grow, energy
gaps would open up and we would again end up with the
spectrum in the middle. So coming from the left it is bands
that form and coming from the right it is gaps that open
up.


Figure III.3.2: Position-momentum duality. In this figure we explain
the real space-momentum space duality in the case of a
periodicity with period a, in the potential for example. We know that in
position space we have $\mathbb{R} = \{-\infty < x < \infty\}$, while the free particle
momenta are also unbounded, $\mathbb{R} = \{-\infty < k < \infty\}$. Working
top down: (i) divide x-space up in identical pieces of size a which
are periodic, so we can think of them as little circles; (ii) relabel
the coordinates by a map $x = ja + \varphi a/2\pi$ onto a pair $(\varphi, j)$, where
$S^1 = \{0 \le \varphi < 2\pi\}$ is the angular coordinate of a circle with radius
$a/2\pi$, and $j \in \mathbb{Z}$ is an integer with $-\infty < j < \infty$.


Periodicity and the reciprocal lattice. Let us consider 
the one-dimensional case where the electrons will move in 
the periodic potential of the ions on a lattice. The periodic- 
ity implies an invariance of the potential under translations 
over the lattice distance a. And the electron wavefunctions 
will then carry certain representations of that symmetry. 
The fact that the potential is periodic does not mean that 
the wavefunctions themselves have to be periodic. The 
situation is similar to the case of the single atom where 


the potential is spherically symmetric around the nucleus, 
but the quantum states are generally not spherically sym- 
metric. They form representations of the rotation group 
labeled by the quantum numbers l and m. 


The situation we have depicted on the right-hand side of
Figure III.3.1 is illuminating. Let us consider the free particle
limit of the spectrum and think of the states as states in a
periodic (though vanishing) potential. This perspective is
visualized in Figure III.3.2 where the top and bottom half
are dual to each other. We start in ordinary position x-space
which in one dimension is just the real line $\mathbb{R}$. We
think of it as a periodic sequence of intervals of size a, the
lattice distance. This means that we interpret periodic x-space
as a product of a circle with circumference a and
an infinite lattice $\Lambda = \mathbb{Z}$ with points $x_j$ labeled by an integer
j, where $x_j = ja$. So we may now quantize the
free particle on this product space, and try to recover the
free particle spectrum on the real line, being a continuous
spectrum $-\infty < k < \infty$, as indicated by the real line
at the bottom of the figure. The free particle quantization
on the circle of circumference a yields states that correspond to
the discrete ‘reciprocal’ lattice $\Lambda^* = \mathbb{Z}$, labeled by the set of
integers $\{-\infty < n < \infty\}$ and corresponding k-values
$k_n = 2\pi n/a$. It is strictly analogous to the simple Bohr
atom. The quantization of a discrete position lattice produces
states labeled by a continuous set of values q that
form a circle, a periodic interval $-\pi/a < q < \pi/a$. This
fundamental domain of q-values is called the first Brillouin
zone. Combining these plane wave quantum numbers we
indeed recover the overall k spectrum by simply multiplying
the individual exponentials (wave functions), which leads
to the identification $k = k_n + q$, corresponding to adding
the exponents.
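
The identification of k with a reciprocal lattice point plus a remainder
in the first Brillouin zone can be written as a small function. This is
a sketch with our own naming, and it assumes the half-open convention
$-\pi/a \le q < \pi/a$ so that every k gets exactly one pair (n, q).

```python
import math

def fold_into_brillouin_zone(k, a):
    """Split a wave number k into (n, q) with k = 2*pi*n/a + q and
    q in the first Brillouin zone, taken here as -pi/a <= q < pi/a."""
    g = 2 * math.pi / a              # smallest reciprocal lattice vector
    n = math.floor(k / g + 0.5)      # index of the nearest reciprocal lattice point
    q = k - n * g
    return n, q

a = 1.0
for k in (0.3, 7.0, -4.0):
    n, q = fold_into_brillouin_zone(k, a)
    print(k, n, q)
```

Adding the two parts back together, $2\pi n/a + q$, returns the original
k, which is the statement that the exponents add.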


The Brillouin zone. The procedure just outlined is actually
quite general, and works in any dimension. You start
with a d-dimensional periodic lattice $\Lambda$ in $\mathbb{R}^d$ where we
basically identify the points of the x-lattice. This means
that the space $\mathbb{R}^d$ can be thought of as a ‘product’ of a d-dimensional
torus $\mathbb{R}^d/\Lambda$ times the lattice $\Lambda$. Free particle
quantization then gives a part from the torus which yields
the dual or reciprocal lattice $\Lambda^*$ and a part from the lattice
which produces a particular dual torus.


(a) The lattice $\Lambda$ in x-space.

(b) The dual (reciprocal) lattice $\Lambda^*$ in k-space.

(c) The Wigner-Seitz cell.

(d) The (first) Brillouin zone.

Figure III.3.3: The real space lattice and the reciprocal wave vector lattice.


We have illustrated this explicitly for the two-dimensional 
case in Figure III.3.3. In the Figure (a) we highlighted the 
so-called periodic unit cell, where the symmetry group of 
$\Lambda$ is generated by the two basic orange translation vectors.
In fact there is an even smaller so-called fundamental do- 
main with which the whole plane can be tiled through peri- 
odic copying. This domain is highlighted in Figure (c) and 
obtained as follows. First we start at the origin, and con- 
nect it with all neighbouring sites (orange lines), then we 
draw the perpendicular bisectors (green lines) of the con- 
necting lines. These bisectors then enclose a fundamental 
periodic (closed) domain called a Wigner-Seitz cell. One 
easily verifies that this cell allows for a space-filling tiling. 
In figures (a) and (b) we show the construction of the dual
lattice; the vectors in the lattices have to satisfy the duality
condition:


$$e^{i\mathbf{k}_n \cdot \mathbf{x}_j} = 1, \qquad \mathbf{k}_n \in \Lambda^*, \quad \mathbf{x}_j \in \Lambda. \tag{III.3.1}$$


The basic translation vectors defining the reciprocal lattice,
$\mathbf{T}_1$ and $\mathbf{T}_2$, are obtained from the basic translation vectors
$\mathbf{t}_1$ and $\mathbf{t}_2$ by the conditions $\mathbf{t}_i \cdot \mathbf{T}_j = 2\pi\delta_{ij}$. The fundamental
domain of the dual lattice constructed in Figure (d) is
referred to by condensed matter physicists as the (first) Brillouin
zone. The ‘Brillouin zone’ is the ‘Wigner-Seitz cell’ in
wave-vector space.
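
The conditions on the reciprocal basis vectors can be solved explicitly
in two dimensions. The sketch below does this for an arbitrary pair of
basis vectors and then checks the duality condition for one lattice
vector of each kind. The function names and the triangular lattice
used as an example are our own choices.

```python
import cmath
import math

def reciprocal_basis(t1, t2):
    """Return T1, T2 with t_i . T_j = 2*pi*delta_ij for a two-dimensional lattice."""
    det = t1[0] * t2[1] - t1[1] * t2[0]     # signed area of the unit cell
    if det == 0:
        raise ValueError("t1 and t2 must be linearly independent")
    scale = 2 * math.pi / det
    T1 = (scale * t2[1], -scale * t2[0])
    T2 = (-scale * t1[1], scale * t1[0])
    return T1, T2

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Example: a triangular lattice with unit lattice distance.
t1, t2 = (1.0, 0.0), (0.5, math.sqrt(3) / 2)
T1, T2 = reciprocal_basis(t1, t2)

# Any integer combinations x = 4 t1 + 5 t2 and k = 2 T1 - 3 T2
# should satisfy exp(i k . x) = 1.
x = (4 * t1[0] + 5 * t2[0], 4 * t1[1] + 5 * t2[1])
k = (2 * T1[0] - 3 * T2[0], 2 * T1[1] - 3 * T2[1])
phase = cmath.exp(1j * dot(k, x))
print(T1, T2, phase)
```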


Electron wavefunctions: bands and gaps. Let us return
to the one-dimensional case, and look at the states in the
free particle limit as we have depicted in Figure III.3.4. We
have plotted the energy as function of the momentum, or
the dispersion $E = E(k)$, but we reduced the k-value by
some dual lattice vector $2\pi n/a$, so as to bring it into the Brillouin
zone. In other words we plot $E(k) = E_n(q)$, and that
is in fact what is shown on the right-hand side of Figure
III.3.1, and in the parametrization given in the lower half of
Figure III.3.2. So indeed, the free electrons can have any
energy $E_n(q) \ge 0$. Note that in the resulting spectrum the
levels fold over at the boundaries ($q = \pm\pi/a$) and cross
in the middle where $q = 0$. If we return to Figure III.3.1 we
have argued why by increasing the nuclear potential the
continuous spectrum will break up, and gaps will open up
as depicted in Figure III.3.5, exactly for the special values
of q as indicated. Let us now after this introduction move
on to the generic spectrum of the quantum electron fluid in
an ordinary solid.


Figure III.3.4: The Brillouin zone. We have plotted the energy
as function of momentum for free electrons (in one dimension)
but have shifted the momentum by an integer times the smallest
reciprocal lattice vector $k_1 = 2\pi/a$ so as to bring it into the Brillouin
zone $-\pi/a < k < \pi/a$, the white colored region. The horizontal
axis is the momentum axis, along the vertical axis we have put
the electron energy $E = E_n(q)$.
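
The folding of the free spectrum and the opening of a gap can be
illustrated with a short calculation. The first function simply
evaluates the free dispersion in the folded parametrization, in units
where $\hbar^2/2m = 1$. The second uses the standard two-level
approximation for nearly free electrons, which is not spelled out in
the text: near a crossing, a weak periodic potential couples the two
degenerate plane waves through a single Fourier component, here a
hypothetical number v.

```python
import math

def free_band_energy(n, q, a=1.0):
    """Free-electron energy E_n(q) = (q + 2*pi*n/a)**2, in units of hbar^2/2m."""
    return (q + 2 * math.pi * n / a) ** 2

def split_levels(e1, e2, v):
    """Eigenvalues of the 2x2 matrix [[e1, v], [v, e2]] coupling two plane
    waves through one Fourier component v of the periodic potential."""
    mean = (e1 + e2) / 2
    half = math.hypot((e1 - e2) / 2, v)
    return mean - half, mean + half

edge = math.pi   # zone boundary q = pi/a for a = 1

# Bands n = 0 and n = -1 meet at the zone boundary,
# bands n = 1 and n = -1 cross in the middle of the zone.
print(free_band_energy(0, edge), free_band_energy(-1, edge))
print(free_band_energy(1, 0.0), free_band_energy(-1, 0.0))

# A weak coupling v lifts the degeneracy and opens a gap of size 2*|v|.
lower, upper = split_levels(free_band_energy(0, edge),
                            free_band_energy(-1, edge), v=0.5)
print(lower, upper, upper - lower)
```

Away from the crossing points the two free energies differ by much more
than v, and in this approximation the levels are then hardly shifted.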


Valence and conduction bands. In Figure III.3.6 we give
the band structure in the periodic potential landscape of
the lattice in which the electrons live. The landscape is
characterized by the interatomic distance and the height
$V_0$ of the potential barrier. The electrons fill the bands to
a certain maximum level which is called the Fermi level,
marked by the white dashed line. The two bands closest
to the Fermi level are called the valence band and the
conduction band and as we will see the properties of the
material will depend strongly on where these bands are
located with respect to the Fermi level. The inner electron
bands below the valence band consist of pretty much localized
states. The allowed states in the ‘conduction’ bands
are not localized but extended, which means that electrons
can move anywhere in the sample.


Figure III.3.5: Gaps open up. Gaps open up where dispersion
curves cross the boundary of the Brillouin zone, or where they
intersect. Even though the states will be deformed, the label n of
the previous figure remains the label for two successive bands.


Conductors and insulators. How the electrons in the 
solid collectively behave strongly depends on the position 
of the Fermi surface, which Figure Ill.3.7 demonstrates. 
If the Fermi level is in the middle of the valence band, 
the electrons can move easily because there will be many 
states available with some more energy, and the material 
is therefore a conductor for electric currents. If the valence 
band is completely filled and there is not energy enough 
to enter the conduction band, the electrons cannot move 
and we are dealing with an insulator and we say that the 


Conduction band 


Inter-atomic distance 


Figure III.3.6: Electron bands in a crystal. In a crystal the en- 
ergy levels are all filled up to the Fermi level (dashed line). The 
two bands closest to the Fermi level are called the valence and 
conduction band. The periodic potential is characterized by the 
interatomic distance and the height of the potential barrier V_0. 


medium has an energy gap — is gapped. The interme- 
diate case of a semiconductor deserves a section of its 
own. 


Semiconductors. 


Finally we can imagine that the energy gap between va- 
lence and conduction band is narrow, so that not much en- 
ergy is needed to excite electrons into the next band. This 
is typically the situation in a semiconductor. The image 
on the right in Figure III.3.7 shows a narrow band gap of 
a semiconductor at room temperature. The coloring indi- 
cates that because of the thermal energy some electronic 
states at the bottom of the conduction band will be occu- 
pied leaving some holes in the valence band. In the next 
figure we show again the typical energy landscape of what 
is called an intrinsic semiconductor, with the two bands 
Figure III.3.7: Energy bands. The admissible energy levels for 
the electrons form the valence and conduction band, where the 
Fermi level is marked by the dashed white lines. We distinguish 
a conductor (l) where there is basically no gap, an insulator (m) 
where the valence band is filled and there is a big gap, and 
a semiconductor (r) with a narrow gap. The filled states are 
colored orange and empty states blue. 


and the Fermi level right in between. The electron/hole 
density in equilibrium is determined by the energy differ- 
ence between the (conduction/valence) band edge and the 
Fermi level, which means that as E_- = E_+ = E_g/2 the 
number of charge carriers n_± is exponentially suppressed 
by a Boltzmann factor exp(-E_g/2kT). But this also im- 
plies that its dependence on the energy gap is exponential 
and that fact is exploited in the idea of doped semicon- 
ductors on which all basic semiconductor devices such as 
transistors are based. 
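
The exponential sensitivity to the gap can be made concrete with a small numerical sketch. The gap values used below (about 1.12 eV for silicon and 0.67 eV for germanium at room temperature) are typical textbook numbers that do not come from the text above, and the function name is purely illustrative.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV per kelvin

def boltzmann_suppression(gap_ev, temperature_k=300.0):
    """Return exp(-E_g / 2kT), the factor suppressing carriers in an intrinsic semiconductor."""
    return math.exp(-gap_ev / (2.0 * K_BOLTZMANN_EV * temperature_k))

# Assumed textbook gap values (not taken from the text): silicon and germanium.
for name, gap in [("Si", 1.12), ("Ge", 0.67)]:
    print(f"{name}: E_g = {gap} eV, suppression = {boltzmann_suppression(gap):.2e}")
```

With these assumed values, a gap that is smaller by less than a factor of two changes the suppression factor by more than three orders of magnitude, which is the kind of leverage that doping exploits.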


Semiconductors like silicon are at the heart of all modern 
information storing and processing devices. It is not by ac- 
cident that the Californian cradle of the information revolu- 
tion we have witnessed is called ‘Silicon Valley’. And it was 
because of the ever smaller scales at which the semicon- 
ductor switches (transistors) could be implemented and 




Figure III.3.8: The intrinsic semiconductor. The intrinsic semi- 
conductor is characterized by a narrow gap between valence 
and conduction band, with the Fermi level exactly in between. 
The horizontal axis is the space axis, along the vertical axis we 
have put the electron energy. 


exploited that the spectacular large-scale integration of pro- 
cessor and memory chips became possible. 


A doped semiconductor. The possibility of doping al- 
lows you to somewhat customize the energy landscape 
in semiconductors. What one does is to replace a cer- 
tain percentage of the silicon atoms in the lattice by either 
phosphorus (P) or boron (B) as indicated in Figure III.3.9. 
In the periodic table phosphorus is the right-hand neighbor 
of silicon and therefore provides an extra electron, which 
makes the material somewhat more negatively charged. 
The effect is to basically lower the band energies with re- 
spect to the Fermi level. Substituting with boron has the 
opposite effect, as boron sits in the column to the left of 
silicon, and therefore has one valence electron less; the 
semiconductor will have an excess of positive charges or 
holes. One may also dope the opposite sides of a semi- 
conductor differently, in which case one gets a pn-diode or 
pn-junction, as we have depicted in Figure III.3.10. Now 
Figure III.3.9: Doped semiconductor. We can replace a certain 
fraction of the silicon atoms in the lattice by either phosphorus 
(P) or boron (B). The former yields an excess of negative charge 
carriers (electrons), called n-doping, whereas the latter leads to 
an excess of positive charge carriers (holes), called p-doping. 


in addition to the band gap E_g, a new energy scale E_p is 
introduced by the doping: on the left side we have many 
electrons and on the right side only a few, because there is 
a relative suppression factor exp(-E_p/kT). For the holes 
the story is just the opposite: many holes on the right and 
few on the left. In the middle, in the so-called depletion 
layer, there are neither free electrons nor free holes; it acts 
as an insulating layer. The Fermi level is the same on both 
sides, as you can always briefly short-circuit the external wires 
until this equilibrium is established. 


Two semiconductor devices. This pn-diode is a simple 
and useful semiconductor device. Let us briefly indicate 
two applications without going into much detail. 


The photo-voltaic cell. The first possible application is to 
make a photo-voltaic cell which basically turns solar radia- 
tion in the form of photons into electron hole pairs by just 
exciting electrons from the valence band to the conduction 




Figure III.3.10: pn-junction. By doping a semiconductor we 
can shift the band structure. With an excess of negative charge 
carriers (n-doping) we lower the bands, whereas with an excess 
of positive charge carriers (p-doping) the bands move up in 
energy. In the figure you see the band profile of a np-doped 
semiconductor or a pn-junction. 


band. This is illustrated in Figure III.3.11, and amounts to 
creating an opposite charge excess on both sides of the 
device, in other words creating a voltage difference be- 
tween the two external plates. Clearly if we couple enough 
of them in a big array, we can generate high voltages and 
big currents. And this is a common way to convert solar 
radiation into electric power. The challenge is to make the 
efficiency large enough: light has to penetrate the 
semiconductor deeply enough to maximize the absorp- 
tion. 
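
As a rough illustration of the excitation step: a photon can only lift an electron across the gap if its energy is at least E_g, which translates into a longest usable wavelength hc/E_g. The sketch below assumes a silicon gap of about 1.12 eV (a typical textbook value, not taken from the text); the function names are illustrative.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV * nm

def threshold_wavelength_nm(gap_ev):
    """Longest photon wavelength (nm) that can still excite an electron across the gap."""
    return HC_EV_NM / gap_ev

def can_excite(wavelength_nm, gap_ev):
    """True if a photon of this wavelength carries at least the gap energy."""
    return HC_EV_NM / wavelength_nm >= gap_ev

silicon_gap = 1.12  # eV, assumed value for illustration
print(f"threshold: {threshold_wavelength_nm(silicon_gap):.0f} nm")
print("green light (550 nm) can excite:", can_excite(550.0, silicon_gap))
print("infrared (1500 nm) can excite:", can_excite(1500.0, silicon_gap))
```

Under this assumption visible light lies comfortably above the threshold, while sufficiently long-wavelength infrared photons pass without creating electron-hole pairs.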


The Light Emitting Diode (LED). In Figure III.3.12 we show 
what happens if we connect the leads to a battery, where 
we introduce a third independent energy scale E_B = eV_B. 
The battery induces an energy (voltage) difference corre- 
sponding to E_B between the left and right Fermi levels. 
These levels split near the depletion layer. One can imag- 
ine what happens: the negative lead pushes the electrons 


Figure III.3.11: Photo-voltaic (Solar) cell. If we have a transpar- 
ent np-doped semiconductor, light (photons) can be absorbed 
by the electrons in the valence band and be excited to the con- 
duction band leaving a hole behind. So a voltage will build up 
over the cell and a current can flow. 


from the left towards the junction, and similarly the positive 
lead will push more holes in the system from the right. The 
effect is that the depletion layer becomes narrower and in 
fact if the voltage is high enough you will get a current of 
electrons and holes through the junction. However, as in 
a stationary state, the relative charge densities between 
the right and left have to remain exponentially different. 
What happens is that in the middle region the electrons 
and holes will recombine and that produces radiation that 
may be absorbed in the material, but of course it is also 
possible to implement this in a way that the radiation in the 
form of photons escapes, and we have a LED. It is a clear 
advantage that the energy is directly converted into elec- 
tromagnetic energy, not by heating a wire which in turn 
starts radiating. Voltages and currents can therefore re- 
main quite low as long as a sufficient percentage of re- 
combined pairs results in visible photons. At present the 
differences are quite stunning: the LED has a lifespan that 
is about a factor 50 higher than that of an incandescent 


Figure III.3.12: Light emitting diode (LED). The LED is more 
or less the converse of the photo-voltaic cell, in that we now ap- 
ply a voltage over the semiconductor, which changes the Fermi 
level on the negative/positive sides. This leads to a recombina- 
tion of electrons and holes in the center region of the junction 
producing light. 


bulb, while it costs about a factor 30 more. It is the energy 
consumption that makes the big difference, because that 
provides an additional factor of 60. This means that over 
the lifetime of an LED your yearly electricity bill would be 
reduced by a few hundred euros/dollars! These numbers 
also underscore the relative waste in the form of heat that 
is produced by the old-fashioned light bulb. 


Superconductivity 


Phonons. It is exciting to go one step deeper into possi- 
ble scenarios for the collective behavior of the electrons. 
Looking more closely at the lattice, we know that the nu- 
clei cannot be completely fixed at their positions on the 
lattice. They are subject to quantum and thermal fluctu- 
ations and these lattice fluctuations lead to waves propa- 
Figure III.3.13: Superconductivity. The discovery of supercon- 
ductivity, as the measurement of a sudden dramatic drop in re- 
sistivity of solid mercury, was made in 1911 in Leiden by Heike 
Kamerlingh Onnes. It took almost fifty years before a funda- 
mental understanding of this phenomenon was achieved. 


gating through the lattice, which are just the familiar sound 
waves as a matter of fact. In the quantum perspective 
these waves are considered to be quasi-particles which 
are called phonons. Just as photons are complemen- 
tary to light waves, phonons are complemen- 
tary to sound waves, and because sound only propagates 
through a material medium these quasi-particles are not 
really fundamental: they are quantized collective excita- 
tions of the underlying medium. 


Cooper pairs. Now the oscillating nuclei are charged and 
we should expect that these waves interact again with the 
electrons. In particle language the phonons will couple 
to the electrons. And the interesting feature of these in- 
teractions is that they lead to an effective attractive force 
between the electrons. In other words the ‘phonons’ be- 
come the carriers of an attractive force between the elec- 
trons. What happens is interesting, close by the elec- 
trons are repelled because of their charge, but that re- 
pulsion is screened on larger distances and there the at- 
Figure III.3.14: High temperature superconductivity. The maximum 
temperature at which superconductivity takes place has 
increased dramatically during the last quarter of the 20th cen- 
tury, but appears to have stabilized again. A fundamental under- 
standing of the underlying mechanism, however, is still lacking. 


tractive force due to the phonons becomes dominant and 
creates bound states of electrons, the electrons pair up 
and form so-called Cooper pairs. At low temperatures you 
may think of the Fermi surface as a sphere in momentum 
or k-space with well-defined radius k_F. A Cooper pair is 
formed by two electrons at opposite points of the sphere, 
where furthermore the electrons have spins pointing in op- 
posite directions. In Figure III.3.15 we have indicated the 
Fermi sphere with two Cooper pairs at the surface, each 
pair bound through the exchange of a virtual phonon. So 
we should think of the electron collective no longer as a 
community of singles but of couples and once more that 
strongly affects the states that are allowed just as in our 
earlier societal analogue. 


The superconducting ground state. I have already re- 
ferred to the spin of particles and the Pauli exclusion princi- 
ple, which decrees that two half-integral spin particles can- 
not occupy the same state whereas integral spin particles 
Figure III.3.15: Cooper pairs. Cooper pairs are bound states of 
two widely separated electrons caused by the exchange of vir- 
tual phonons. This turns the electron collective effectively into a 
gas of charged bosons which then condense into the supercon- 
ducting BCS state. 


can. But after the electrons pair up, we are no longer deal- 
ing with a collective of spin 1/2 electrons, but with pairs of 
electrons with opposite spins, which means that the pairs 
have spin zero. And that has dramatic consequences: 
whereas the electrons cannot sit in the same state and 
push each other to ever higher and higher energy states, 
the charged bosonic pairs all can sit in the same lowest 
energy state. So you can imagine that there is an enor- 
mous energetic advantage for the system of electrons to 
pair up and all ‘condense’ in the ground state. Well, it 
does happen, and we see that for certain conductors, if we 
cool them down sufficiently, the pairs can form and con- 
dense into a surprising new state of matter: the material 
becomes superconducting. A superconductor is a conduc- 
tor with the miraculous property that it conducts electricity 
with absolutely zero resistance! The most dramatic fact is 
maybe that this phenomenon is a macroscopic manifes- 
tation of quantum theory, the superconducting state is a 
macroscopic quantum state. This is possible because all 


the Cooper pairs have condensed into a single quantum 
state. 


Bose-Einstein condensates. These kinds of condensa- 
tion effects are a manifestation of Bose-Einstein conden- 
sation, an effect predicted as early as 1924 by the Indian 
physicist Satyendra Nath Bose and Albert Einstein. And 
indeed many other examples have since been found: for 
example ⁴He is a boson and therefore can condense at 
very low temperature in a state that exhibits the amazing 
property of superfluidity. As we discussed in the previous 
chapter, there is no viscosity in a superfluid: another one 
of these quantum miracles which would be inconceivable 
from a classical point of view. The Bose-Einstein conden- 
sates which have been observed in dilute atomic gases, 
and for which Eric Cornell, Carl Wieman 
and Wolfgang Ketterle received the Physics Nobel prize in 
2001, are another recent discovery. These condensates 
are close to the theoretical setting described in the original 
papers of Bose and Einstein. 


Some history. We have made a small tour d’horizon to 
give you a sense of how rich and surprising the macro- 
scopic behavior of a collective of atoms may be, and how 
intricate the balances of forces are, and to what kind of 
exotic properties of materials this may lead. It also shows 
how creative one has to be to get to a detailed physical un- 
derstanding of such exotic properties. It is worth pointing out 
that superconductivity was discovered by Heike Kamer- 
lingh Onnes in Leiden as early as 1911. He found that 
the resistance of solid mercury immersed in liquid helium 
suddenly dropped to zero at a temperature of 4.2 K, as 
shown in Figure III.3.13. The story goes that he gener- 
ated a persistent circular current and managed to take it 
along to Amsterdam to show it to his colleagues over there! 
Kamerlingh Onnes received the Nobel prize in Physics in 
1913 for ‘his investigations on the properties of matter at 
low temperatures which led, inter alia, to the production of 
liquid helium’. 
Figure III.3.16: Magnetic levitation. A little magnet will be lifted 
above a superconductor, because of the Meissner effect, which 
means that magnetic field lines are expelled from a supercon- 
ducting region. The aura of magic is caused by the boiling liquid 
nitrogen needed to cool the high-temperature superconductor. 
(Source: Michigan State University.) 


The microscopic mechanism underlying superconductivity 
remained a complete mystery for a long time. The Rus- 
sian physicists Lev Landau and Vitaly Ginzburg proposed 
an effective field theory explaining quite a lot of the phe- 
nomenology of the superconductors, but it was not until 
1957 that the fundamental quantum mechanism including 
the pair formation and the precise structure of the super- 
conducting ground state was put forward by the American 
physicists John Bardeen, Leon Cooper and Robert Schri- 
effer, who received the Nobel prize for their groundbreak- 
ing work in 1972. This splendid theory is known as the 
BCS theory of superconductivity. 


Ever since the ‘BCS’ breakthrough in the understanding of 
superconductivity there has been a host of detailed quan- 
tum mechanical explanations for the highly surprising ways 
collectives of atoms may behave and turn into molecu- 
lar gases, liquids, glasses, liquid crystals, magnets, su- 


perconductors or Bose-Einstein condensates, or even as- 
semble into large molecules, all depending on the param- 
eters of the theory. Again, this is the branch of physics 
which Philip W. Anderson, the celebrated American con- 
densed matter theorist who died in 2020, characterized 
by the credo ‘more is different’, referring to the splendid 
diversity of collective quantum behavior that emerges in 
macroscopic systems consisting of many interacting con- 
stituents. We have emphasized that the differences cannot 
always be traced back to the differences in the constitu- 
ent particle types. Though the type of interactions these 
have is absolutely crucial, the macroscopic phase that is 
realized may also depend on external parameters, like the 
temperature, the density, the presence of a magnetic field 
and so on. To conclude we may say that in trying to under- 
stand and predict the splendid diversity of emerging prop- 
erties, quantum reasoning has become absolutely indis- 
pensable. 


The Meissner effect. You might wonder what happens if 
we apply a magnetic field to a superconductor. This is an 
interesting question to ask because we know that a con- 
ductor tends to counteract a change in the magnetic field, 
which means that currents are generated which are such 
that they generate a field in the opposite direction. Now 
you can imagine that because there is no resistance in the 
superconductor these currents will keep running thereby 
permanently counteracting the change in magnetic field. 
The net result is remarkable: magnetic fields cannot pen- 
etrate a superconductor! This expulsion of magnetic fields 
from superconducting regions is called the Meissner ef- 
fect, after the German physicist Walther Meissner who dis- 
covered it in 1933. 


Here some qualifications have to be made though. The 
first is that if we keep increasing the magnetic field we end 
up breaking the pairs and the superconducting phase is 
destroyed. The second is more interesting, and follows 
because the electrons (and pairs) have a funny property. 
It turns out that they cannot detect a specific amount of 




magnetic flux. What happens in the so-called Type II su- 
perconductors is that the magnetic flux can enter the su- 
perconductor as long as it is in quantized portions the elec- 
tron pairs can’t see. In other words there is a minimal 
unit of magnetic flux Φ_0 that is compatible with a con- 
densed charge q and it is given by the simple relation 
Φ_0 = 2πħ/q. 
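
For a condensate of Cooper pairs the condensed charge is q = 2e, so the relation that the minimal flux unit equals 2πħ/q gives h/2e. A minimal numerical sketch, using the exact SI values of h and e (the function name is hypothetical):

```python
import math

H_PLANCK = 6.62607015e-34      # Planck constant in J s (exact SI value)
E_CHARGE = 1.602176634e-19     # elementary charge in C (exact SI value)
HBAR = H_PLANCK / (2.0 * math.pi)

def flux_quantum(charge):
    """Minimal flux unit 2*pi*hbar/q (in weber) compatible with a condensed charge q."""
    return 2.0 * math.pi * HBAR / charge

phi_pairs = flux_quantum(2.0 * E_CHARGE)   # Cooper pairs carry charge 2e
print(f"flux quantum for q = 2e: {phi_pairs:.4e} Wb")
```

The result is of the order of 10^-15 weber, which gives an impression of how many flux lines thread a Type II superconductor even in a modest field.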


In a three-dimensional superconductor these magnetic flux 
lines that enter the superconductor line up parallel to the 
direction of the external magnetic field. However the flux 
lines repel each other and therefore if you increase the 
strength of the field and look in a plane perpendicular to 
the field you see that they tend to form a nice triangular 
lattice. I should point out an additional, or rather comple- 
mentary, view on this situation. The fact is that in the core 
of these magnetic filaments the medium becomes a nor- 
mal conductor again. So in a sense you can say that the 
magnetic field did not enter the superconductor after all 
but corresponds to filaments of a normal conductor in the 
superconductor. 


I have all along been emphasizing the use of symmetry 
arguments. What about the superconducting phase: are 
they of any use there? The answer is affirmative, though 
the argument is somewhat more complex. We all know 
that electric charge is conserved: you cannot lose an elec- 
tric charge; it may be transferred from one fundamental 
particle to another, for example in reactions like proton + 
electron goes to neutron etc. We have mentioned in previ- 
ous chapters that this conservation law is a consequence 
of the internal symmetry called gauge invariance. 


But if, as in the superconductor, the ground state is filled 
with electrically charged particles, then the electric charge is 
no longer conserved: you can change it by arbitrary mul- 
tiples of 2e without changing the physical situation. The 
point is that the superconducting state is unusual in that 
there is no definite number of electrons or pairs in that 
state. So the story here is that in the superconducting 


phase charge is no longer conserved because the gauge 
symmetry is broken. But if a symmetry is broken then we 
have to ask whether there are not defects that we have to 
take into account. Yes indeed, the defects are precisely 
the magnetic vortex lines we have been discussing. The 
symmetry breaking story once more fits exactly the phe- 
nomena observed. 


The quantum Hall effect 


In the phenomenon of superconductivity we have seen one 
of the more subtle ways the system of a rigid lattice can in- 
teract with the gas of electrons and give rise to a rather 
surprising form of collective behavior. Are there other ex- 
amples of interactions electrons may engage in that dras- 
tically change their collective behavior? I wouldn’t ask you 
if the answer wasn’t yes. A stunning example is the so- 
called quantum Hall effect: just like supercon- 
ductivity and superfluidity, it occurs only at temperatures of a few 
kelvin, so that its applications have been limited so far. 
The setting for the quantum Hall effect is a two-dimensio- 
nal conductor (imagine for example a conducting bound- 
ary layer between two insulators) where we apply a strong 
magnetic field perpendicular to the surface. This situation 
is depicted in Figure III.3.17(a). 


The physics in this setting is rather counterintuitive. Imag- 
ine a little slab of quantum Hall medium and apply a 
voltage difference V in the x direction. In a normal con- 
ductor a current I would start flowing in the x direction 
according to Ohm’s law, which decrees that I = V/R, so in- 
versely proportional to the resistance R. In the quantum 
Hall medium, however, the current starts flowing in the y 
direction, perpendicular to the applied voltage! This is even 
the case in classical physics, as Edwin Hall already discov- 
ered in 1879. The transversal Hall resistance as a function 
of the applied magnetic field (with fixed current) is plotted 
in Figure III.3.17(b). We talk about a transversal or Hall- 


Figure III.3.17: From the classical to the quantum Hall effect. 
The horizontal axis of the plots is the magnetic field B in tesla. 
(a) The quantum Hall setup. Driving a current through a planar 
conductor with a strong magnetic field B orthogonal to the plane 
yields a transversal potential V_H. 
(b) The classical Hall effect shows a linearly rising V_H (blue line) 
as a function of the applied magnetic field, while keeping the 
current constant (green line). 
(c) The integer quantum Hall effect showing the plateaus with 
integer ν values. 
(d) The fractional quantum Hall effect with, from right to left, 
the plateaus at successive fractional values of ν. 
resistance (ρ), and a Hall-conductivity σ = 1/ρ. 


The integer quantum Hall effect. To consider this system 
quantum mechanically, there are two things that we ought 
to understand. The first is the behavior of a single 
electron in a magnetic field, and the second is the collec- 
tive behavior of electrons in this setting. It is beyond the 
scope of this book to drag you through the beautiful rea- 
soning, but even if we had done so, the phenomenon would re- 
main quite puzzling and counterintuitive. Whereas accord- 
ing to classical physics the transverse resistance must 
grow linearly with the applied field, in reality it does not! If 
you increase the magnetic field the conductivity remains 
constant over certain intervals and the value of that con- 
ductivity is strictly quantized according to the surprisingly 
simple relation σ = ν e²/2πħ, where ν is the filling frac- 
tion, which is defined as the electron density n_e divided 
by the magnetic flux density n_B in fundamental flux units 
(Φ_0 = 2πħ/e): in other words n_B = eB/2πħ. We have plot- 
ted plateaus in the Hall resistance for the integer effect in 
Figure III.3.17(c). What you see is that as a function of the 
applied magnetic field it has plateaus where it stays con- 
stant until it jumps to the next plateau (with lower ν). 
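
To get a feeling for the numbers, the quantization rule that the Hall conductivity equals ν e²/2πħ can be evaluated directly. The sketch below computes the Hall resistance (the inverse of that conductivity) on the first few integer plateaus, and the filling fraction for a given field; the electron density used is an arbitrary illustrative value and the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck constant in J s (exact SI value)
E_CHARGE = 1.602176634e-19     # elementary charge in C (exact SI value)
HBAR = H_PLANCK / (2.0 * math.pi)

def hall_resistance(nu):
    """Hall resistance 1/sigma (in ohm) on the plateau with filling fraction nu."""
    sigma = nu * E_CHARGE**2 / (2.0 * math.pi * HBAR)
    return 1.0 / sigma

def filling_fraction(electron_density, b_field):
    """nu = n_e / n_B with flux density n_B = e B / (2 pi hbar); densities per m^2, B in tesla."""
    n_b = E_CHARGE * b_field / (2.0 * math.pi * HBAR)
    return electron_density / n_b

for nu in (1, 2, 3, 4):
    print(f"nu = {nu}: R_H = {hall_resistance(nu):.1f} ohm")

n_e = 3.0e15  # electrons per m^2, assumed illustrative density
for b in (2.0, 4.0, 8.0):
    print(f"B = {b} T: nu = {filling_fraction(n_e, b):.2f}")
```

Doubling the field halves the filling fraction at fixed density, which is why the plateaus at higher field correspond to lower ν and a higher Hall resistance.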


The fractional quantum Hall effect. When you turn up 
the magnetic field to large values like 30 tesla, plateaus 
also show up for fractional values of ν such as 1/3 or 2/5, in 
which case we speak of the fractional quantum Hall effect 
as depicted in Figure III.3.17(d). In the fractional quantum 
Hall effect we have the unusual situation that the charge 
carriers in the medium are no longer electrons. Rather 
they correspond to localized collective excitations of the 
system which carry fractional electric charges, such as 
e/3 or e/5 depending on which plateau you are on. 


So, to put it in more pictorial terms: if I were to add an elec- 
tron to a quantum Hall system it would ‘fall apart’ into a set of 
fractional charges as displayed in Figure III.3.18. However, 
you should not think of these charge carriers as some kind 
of special ‘quark-like’ particles that make up an electron. 




Figure III.3.18: The quantum Hall fluid. Putting a 2-dimensio- 
nal free electron gas near absolute zero in a strong magnetic 
field one obtains a quantum Hall fluid. Adding a single elec- 
tron charge to the quantum Hall fluid, the charge will fractional- 
ize into three anyons each with charge e/3. These anyons are 
quasiparticles, and are in fact flux-charge composites carrying 
an exotic spin value s = qΦ_0/2π = (e/3)(ħ/e) = ħ/3. 


No, these fractional charges are carried by well-localized 
collective excitations, special modes of the electron field in 
the presence of the magnetic flux. So, these collective ex- 
citations are not only charged, they also carry a magnetic 
flux quantum along with them. The flux quanta are in that 
sense the magnetic defects we saw in the type II super- 
conductors, which become particle-like in a plane or- 
thogonal to the magnetic flux, but now these flux particles 
are dressed with a fractional electric charge. Such dually 
charged excitations that basically can only occur in two 
dimensions are called anyons. We have discussed such 
flux-charge composites in the section on spin and statistics 
on page 405 of Chapter II.5, and more specifically the sub- 
section on two-dimensional exotics on page 416. There 
we showed that such composites may indeed exhibit not 
just fractional charge, but also fractional spin and statistics 
properties. For the case where the basic anyonic charge 
corresponds to q = e/3, we demonstrated that the spin of 
the anyon corresponds to s = qΦ_0/2π = ħ/3. 


Quantum Hall systems are of fundamental interest because 
they represent truly novel states of matter, the existence of 
which nobody had anticipated. The integer quantum Hall 
effect was discovered by the German physicist Klaus von 
Klitzing in 1980, for which he received the Nobel prize in 
1985. The more complicated fractional quantum Hall ef- 
fect, featuring the fractional charge and exotic statistics 
properties, was discovered in the early 1980s and a No- 
bel prize for theory and experiment was awarded in 1998 
to Robert Laughlin, Horst Störmer and Daniel Tsui. 


Topological order 


Quantum Hall conductors constitute entirely novel states of 
matter, fundamentally different from the more familiar con- 
ducting phases, like ordinary conductors, semi- or super- 
conductors, which are usually referred to as Fermi liquids. 
From 1980 onwards many phases which exhibit similar un- 
usual behavior have been discovered; these phases which 
are characterized by certain non-trivial topological interac- 
tions are now considered manifestations of a generic prop- 
erty called topological order. It concerns phases which are 
gapped, which means that there are no massless degrees 
of freedom in the system, the relevant degrees of freedom 
are massive like the anyons, and these have topological 
long range interactions leading to their non-trivial spin and 
statistics properties. 


Quantum statistics. The term anyon was coined by the 
American physicist Frank Wilczek because these fraction- 
ally charged particles also have exotic quantum 
statistics properties. We have emphasized the essential 
difference between bosons and fermions, where the lat- 
ter obey the Pauli exclusion principle saying that no two 
fermions can sit in exactly the same state whereas bosons 


can. Another way of saying this is that if we consider a 
multi-particle state and we interchange two identical 
particles then the phase of the state may change. In three 
or more dimensions, if we repeat the interchange opera- 
tion, denoted by t, we are back to the original state, so that 
implies that t² = 1, which means that the phase change 
has to equal t = ±1. If we interchange two bosons the 
state remains unchanged, t = 1, and if we interchange two 
fermions the state changes sign, so t = -1. We have al- 
ready pointed out that this difference in statistics (we call 
it statistics because the rule affects the way the particles 
can be distributed over the available states) accounts for 
the crucial differences in properties in many body systems. 
We recall the essential role of the Pauli exclusion principle 
in understanding the spectrum of atoms with more than 
one or two electrons. 


Braid statistics. The anyons that occur in two-dimensio- 
nal topologically ordered media satisfy a type of statistics 
referred to as braid statistics, where there is an essential 
phase difference between interchanging particles clock- 
wise or counterclockwise. So to calculate the state after 
some time you have to keep track of how often and in what 
direction the particles have moved around each other. One 
has to deal with the braid of particle world lines in space- 
time. And to know the state exactly you have to know the 
braid. A braid is much like a knot, and the theory of knots is 
a well-studied subject in the topology of three-dimensional 
manifolds. If we have a particular braid of five differently 
colored strands, we could connect the corresponding in- 
coming and outgoing strands to obtain a closed knot made 
of five strands. It is topological because it doesn’t matter 
at what distance the world lines wind around each other 
and moreover we may move the strands around and de- 
form the knot; but as long as we don't cut the strands the 
knot remains topologically the same. The knot will be char- 
acterized by a number of topological invariants. In terms 
of the quantum Hall effect this means that the way the quan- 
tum state changes depends only on who danced around 
whom and in what order. Another way to say this is that the 
multi-anyon states exhibit long range entanglement. 
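
The difference between ordinary and braid statistics can be illustrated with plain complex phases. In the sketch below an exchange multiplies the state by exp(iθ): θ = 0 for bosons, θ = π for fermions, and the value θ = π/3 is used as an illustrative anyonic angle (an assumption made for this example; the text above does not specify it). A counterclockwise exchange is taken to contribute exp(+iθ) and a clockwise one exp(-iθ), which is a convention chosen here, and the function names are hypothetical.

```python
import cmath
import math

def braid_phase(theta, exchanges):
    """Total phase after a sequence of exchanges: +1 = counterclockwise, -1 = clockwise."""
    return cmath.exp(1j * theta * sum(exchanges))

def is_close(z, w, tol=1e-12):
    """Compare two complex numbers up to rounding error."""
    return abs(z - w) < tol

for name, theta in [("boson", 0.0), ("fermion", math.pi), ("anyon", math.pi / 3)]:
    twice = braid_phase(theta, [+1, +1])   # exchange twice in the same direction
    ccw = braid_phase(theta, [+1])
    cw = braid_phase(theta, [-1])
    print(name,
          "| double exchange trivial:", is_close(twice, 1.0),
          "| clockwise equals counterclockwise:", is_close(cw, ccw))
```

In this simplest (abelian) case only the net winding matters, so an exchange followed by its reverse is trivial, while the direction of each individual exchange does leave its mark on the state.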


All possible braids can be composed of elementary moves 
of moving neighboring pairs around each other. The set 
of all such intertwining operations forms again a group, 
and the mathematics of such groups is well understood. In
higher dimensions one can only have bosons or fermions,
exactly because winding the paths clockwise or anticlockwise
is topologically equivalent, while in two dimensions
there is in principle an unlimited number of topologically
inequivalent windings possible, and therefore also of
possible quantum statistics. It is even possible that different
particle types exhibit non-trivial mutual statistics.
This means that the phase of the state may change
after moving a particle around another type of particle: in
other words, after applying the interchange twice to a pair
of particles belonging to different species. In general the
multi-anyon states are formally classified as unitary
representations of the braid group.
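
For the simplest case of two identical abelian anyons, the bookkeeping described above reduces to counting the net number of counterclockwise minus clockwise exchanges. The following minimal Python sketch illustrates this under that assumption; the function name braid_phase and the statistical angle theta are illustrative choices and do not come from the text.

```python
import cmath
import math


def braid_phase(braid_word, theta):
    """Phase picked up by a two-anyon state after a sequence of exchanges.

    braid_word: list of +1 (counterclockwise exchange) and -1 (clockwise exchange).
    theta: statistical angle (hypothetical parameter; 0 for bosons, pi for fermions).
    """
    winding = sum(braid_word)  # for abelian anyons only the net winding matters
    return cmath.exp(1j * theta * winding)


for name, theta in [("boson", 0.0), ("fermion", math.pi), ("anyon", math.pi / 3)]:
    ccw = braid_phase([+1], theta)   # one counterclockwise exchange
    cw = braid_phase([-1], theta)    # one clockwise exchange
    same = cmath.isclose(ccw, cw, abs_tol=1e-12)
    print(f"{name}: clockwise and counterclockwise phases agree: {same}")
```

For bosons and fermions the two directions give the same phase, which is why the direction of the exchange never needed to be recorded before; for any other angle the two phases differ and the braid has to be tracked.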


Topological field theory. You may wonder what the theo- 
retical models look like that effectively describe these topo- 
logically ordered phases like the quantum Hall fluid. A 
large and important class of models, but not the only type,
are the so-called topological field theories, in particular the
(2+1)-dimensional Chern-Simons theories. Such a theory is an
effective theory, which describes the phenomenology of the
topologically ordered phases to a certain extent, and one has
to realize that a derivation of this theory from first principles
is hard. To give you a flavour of what such theories
look like, I show a basic example that is provided by just a
(charge q) current j^μ coupled to a gauge field A_μ that is
described by a U(1) Chern-Simons theory. The equations
in relativistic notation are actually quite simple and given
by:


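\[
\frac{\lambda}{2}\, F_{\mu\nu} = \epsilon_{\mu\nu\sigma}\, j^{\sigma}
\quad\Longrightarrow\quad
\begin{cases}
\dfrac{\lambda}{2}\, F_{12} = j_0 \;\Rightarrow\; B = \dfrac{2\rho}{\lambda} \\[2ex]
\dfrac{\lambda}{2}\, E_i = \epsilon_{ij}\, j_j
\end{cases}
\qquad \text{(III.3.2)}
\]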


where the parameter λ is the coefficient of the Chern-Simons
term, which, depending on the setting, will be quantized
as well. In the quantum Hall effect it is directly linked
to the quantized plateau conductivity. What these equations
imply becomes clear if we look at simple situations:


(i) If there is no charge or current, the equations say that
there is no field: this is an expression of the fact that there
is a gap, and that there are no gauge field quanta, perhaps
because they are too heavy to be excited. In other words
the pure Chern-Simons theory has a ‘gauge field’ but that
field does not describe local field degrees of freedom like
photons. It is a purely topological theory, meaning that the
only physical observables are the path dependent phase
factors corresponding to closed loop integrals of the gauge
field A_μ.

(ii) If there is a single charge at rest (j_0 ≠ 0), we see from
the top equation on the right that the charge gets ‘dressed’
with a magnetic flux (F_12 ≠ 0), or, the other way around,
a given flux quantum may attract charge, thereby creating
a dually charged anyon. Integrating the charge distribution
one obtains the relation between the flux Φ and
charge q of the anyon, Φ = 2q/λ. This in turn means that
if two of those anyons encircle each other one obtains a
phase factor exp(−iq²/λ), which can take all kinds of values.

(iii) The second equation describes the effect of applying a
voltage across the sample; the resulting (Hall) current is 
perpendicular to the electric field. We see that this Chern- 
Simons term induces exactly the properties we have de- 
scribed before. 


Chern-Simons theory. The Chern-Simons theories
play a fundamental role in modern physics and mathematics.
The American mathematical physicist (and outstanding
string theorist) Edward Witten from the Institute
for Advanced Study in Princeton recognized their relevance
for three-dimensional topology and the associated physical
phenomena. In 1989 he noted that the Chern-Simons
action provides an intrinsically three-dimensional definition
of knot invariants, and as one is free to choose the gauge
group, it defines an infinity of them. For this work he was
awarded the Fields medal, the mathematical equivalent
of a Nobel prize, in 1990. Secondly, Witten showed that
if we look at the theory on spaces with a boundary, the 
theory can be entirely described by an equivalent (1+1)- 
dimensional conformal invariant field theory on the bound- 
ary which is a striking example of the holographic princi- 
ple we discussed at the end of Chapter ?? in the con- 
text of black holes. Finally, Witten also showed that Ein- 
stein’s theory of gravity in three space-time dimensions is 
actually a Chern-Simons theory where the gauge group is 
the group of local translations and Lorentz transformations. 
This provides an exciting laboratory to explore the ideas of 
the holographic principle etc. And as we emphasized in
this chapter, topological field theory has become an indispensable
tool for the description and understanding of a
wide variety of topologically ordered phases in condensed 
matter. 


Topological quantum computation. These topological
systems can be characterized by certain symmetries which
are quite hidden, and which are examples of what are nowadays
called quantum groups or Hopf algebras. There is a rapidly
growing interest in this field of topologically ordered me- 
dia and more recently also materials called topological in- 
sulators, which exhibit topological order in three dimen- 
sions. These media appear to be quite ideal candidates 
for quantum information storage and processing, exactly 
because one can change the state by moving particles 
around each other. Loosely speaking a computation is 
nothing but a particular complicated braid or knot of a ‘reg- 
ister’ of anyons in space-time. It is an intrinsically fault tol- 
erant way of doing quantum computations because topo- 
logical moves are insensitive to local perturbations, that is 
perturbations caused by local interactions and that is all we 
have been talking about. No surprise therefore that many
think of this as a development of great significance. And by
many I not only mean scientists, but also security bosses
of public organizations and others who have to hide big
$ecret$ behind huge numerical keys which were once believed
to be unbreakable, but will not be in the future. Just wait for
quantum technologies to come and get them.


Figure III.3.19: The quantum critical point. A quantum critical
point separates at zero temperature an ordered (antiferromagnetic)
and a disordered phase (a heavy fermion metal). For finite
temperature it opens up a region of quantum critical phases,
such as what are called ‘strange metals’.


Quantum critical points. Figure III.3.19 shows the phase 
diagram of what is called a strange metal, which is charac- 
terized by an anomalous quantum critical phase in which 
the electrical resistivity varies linearly with temperature. 
This behavior shows up not only at a singular quantum 
critical point (QCP) at zero temperature, but over an ex- 
tended range of a relevant tuning parameter in the phase 
diagram. This highly unconventional behavior has defied 
description within the standard model for metals. 


This provides a new topic that is vigorously pursued at
present, and there appear to be a variety of systems that
exhibit such a quantum critical point. The general picture
that now emerges is that at the quantum critical point the
system can be modelled by an interacting (2+1)-dimensional
conformal field theory. This effective theory may, depending
on the case, describe emergent Dirac fermions,
scalar (Higgs-like) fields and even emergent U(1) gauge
fields. So in a sense many of the previously known models
based on principles of gauge invariance, symmetry breaking
and so on make a surprising comeback on a totally
different stage. But what is most striking is that the original
electron degrees of freedom are strongly entangled over
large distances, and manifest themselves in vastly different
guises. One says that these conformal phases are no
longer ‘adiabatically’ connected to the original Fermi liquid
phases. There is no smooth way to connect the two
regimes.


Figure III.3.20: High Tc superconductivity. The proposed more
complicated phase diagram for the cuprates exhibiting a high Tc
superconducting phase near a quantum critical point (QCP).


What makes the quantum critical point relevant is that the
behavior persists away from the critical point. So for example
there is a well-accepted view that high Tc superconductivity,
which is effectively realized in the two-dimensional
layers of certain materials denoted as cuprates, is governed
by such a QCP, as we have indicated in the phase
diagram of Figure III.3.20. So the high temperature superconducting
phase would be described by a finite temperature
version of the (2+1)-dimensional conformal field
theory in question. New insights in these theories, which
have been inspired by theories of quantum gravity, like
string theory and the AdS-CFT correspondence that we
discussed before, definitely look promising in a bid to unravel
the mysteries of these strange metals. String theory
and hard-core condensed matter theory seem strange bedfellows
at first sight, but apparently science doesn’t know
of any taboos in that respect.


Further reading. 
On condensed matter physics:


Introduction to Solid State Physics
Charles Kittel
Wiley (2004)


Solid State Physics
Neil W. Ashcroft and N. David Mermin
Thomson Press (2003)


Principles of Condensed Matter Physics
P. M. Chaikin and T. C. Lubensky
Cambridge University Press (1995)


Modern Condensed Matter Physics
Steven M. Girvin and Kun Yang
Cambridge University Press (2019)


On superconductivity:


Introduction to Superconductivity
Michael Tinkham
Dover Publications (2004)


On topological media:


The Quantum Hall Effect
Daijiro Yoshioka
Springer (2010)


Introduction to Topological Quantum Computation
Jiannis K. Pachos
Cambridge University Press (2012)


Quantum Phase Transitions
Subir Sachdev
Cambridge University Press (2011)


Chapter III.4


Scale dependence


In this chapter we explore the notion of scaling. How does 
the behavior of physics change if one changes the length 
or momentum scales? 

We start with some simple geometrical examples of scaling,
leading to the notions of scale invariance, self-similarity,
and fractals. We move on to discrete maps like conformal 
mappings used by Escher and dynamical systems like the 
logistic map. 

The next step is to study scaling in physical models, both 
classical and quantum. This culminates in the notion of 
renormalization in quantum field theory and the wonderful 
idea of running coupling constants. We discuss what scal- 
ing tells us about the asymptotic behavior of physical the- 
ories like the standard model and the possibility of (grand) 
unification in theories of the fundamental interactions. Fi- 
nally we point out the profound link between scale (and 
conformal) invariance and critical behavior.


What sets the scale? 


When children start building bridges with LEGO they learn
what construction engineers know too well: if one simply
keeps scaling up the size of a construction it will at a certain
point collapse. By simply scaling we mean that we
multiply all linear sizes by some given factor. One cannot
simply multiply all beam sizes by a factor two to construct
a bridge that will span a river twice as wide. The basic
reason for this breakdown of scaling was given by Galilei
in his Discourses on Two New Sciences, and boils down
to the basic observation that the mass of a beam scales as
a volume, that is a length cubed, while the strength of the
beam would only grow with the transverse area, meaning a
length squared. And because the cubic power grows faster
than the square, at a certain scale the beam has to break
under its own weight.
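
The argument can be made concrete with a few lines of Python. This is only a sketch of the scaling exponents: the function name load_ratio is hypothetical, and all material constants are set to one, so only the growth with the scale factor is meaningful.

```python
def load_ratio(scale):
    """Weight-to-strength ratio of a beam after multiplying all lengths by `scale`,
    relative to the original beam (ratio 1 at scale 1)."""
    weight = scale ** 3     # mass grows like a volume, a length cubed
    strength = scale ** 2   # strength grows like the transverse area, a length squared
    return weight / strength


for s in (1, 2, 10, 100):
    print(f"lengths x{s}: weight/strength grows by a factor {load_ratio(s):g}")
```

The ratio grows linearly with the scale factor, so whatever safety margin the original beam has is used up at some finite size.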


The question ‘what sets the scale’ is a vital one, which one 
had better address before embarking on detailed calcu- 
lations. In physics the answer is determined by, and ex- 
pressed in the available dimensionful parameters of the 
model one employs. Educated guesses are then based 
on what is called dimensional analysis of the parameters 
that are present in the problem. A given particle mass for 
example sets a relevant energy scale in a theory in the 
sense that it separates two regimes defined by energies 
much smaller and much larger than that mass. One ex- 
pects that at low energies that mass is so big that the par- 
ticle will not be excited and therefore will play a negligible 
role, whereas at high energies the field will effectively be- 
have like a massless field mediating long range interac- 
tions, and you expect it to be relevant. 


A mass is a dimensionful parameter, and it raises the question
what it means to have dimensionless parameters. We
have already extensively exploited this principle of dimensional
analysis in Chapter I.3, devoted to universal constants,
scales and units.¹ At this point one is tempted to
claim that a theory with only dimensionless parameters is
necessarily invariant under rescaling. In that perspective
it furthermore appears that the behavior of a theory at energies
much larger than the masses present in the theory
will approximate that of some scale invariant model. Interestingly
it turns out that this rule of thumb fails in a fundamental
way in the quantum domain. This puzzle demands
a careful analysis of scale invariance in the quantum domain,
a topic that we explore towards the end of this chapter.


Figure III.4.1: The spiralling tail. This is the spiralling tail of the
panther chameleon living in Madagascar.


We start by showing some relatively easy to envisage geometrical
examples of scaling linked to fractals and self-similarity.
Next we consider simple dynamical systems
where scaling occurs as a function of the parameter in the
model. This situation represents a more abstract setting
for the property of scaling and (broken) scale invariance.
The first is the logistic map, an iconic model which exhibits
the interesting property of deterministic chaos as the
limit of an infinite sequence of period doubling transitions in
the space of solutions. Finally we turn to particle dynamics
and field theory, both from the classical and quantum
point of view. The most surprising and also most difficult
to understand results concerning scaling are to be found
in quantum field theory and generally in many-particle systems.
The crucial observation to be made is that scaling
can be interpreted as the model following a calculable trajectory
in the parameter space of a class of models. And
these trajectories may end on certain fixed points where
the theory becomes scale invariant. However, depending
on the initial conditions the trajectory may also run off to
infinity, in which case the theory loses its validity and predictive
power. This is usually a call for other, maybe new,
physics to be taken into account.


¹ In this chapter we will adopt the natural units ħ = c = 1 (except
where explicitly indicated otherwise), which means that we can express
all dimensional quantities in units of length, denoted as [x] ∼ ℓ, or in
units of mass (or energy), denoted by [mass] ∼ kg, which scales as
inverse length: [mass] = [length]⁻¹ ∼ ℓ⁻¹. I will from here on express
all quantities in units of length.


Figure III.4.2: The spiralling snail house. This is a beautiful cut
of a multi-chamber spiralling house of a snail. The superposed
red spiral is a so-called Fibonacci spiral that gives a reasonable
approximation.
Figure III.4.3: The logarithmic spiral. The spiral is given by the
equation (III.4.1) corresponding to the red curve. Under a scale
transformation r → λr the spiral is rotated over an angle ln λ
corresponding to the blue dashed curve. The curve is therefore
strictly invariant only under the discrete set of transformations
where λ_n = exp 2πn.


Scaling in geometry 


Self similarity and fractals 


Scaling. If we scale an object, we mean to say that under
a rescaling of the coordinates it transforms into a larger or
smaller conformal object, an object with the same shape
but of a different size. If we say that something scales we
mean that it has a specific behavior under scale transformations.
For example we may have a geometric object like
a triangle and ask how it scales when we divide all coordinates
by a factor two. Evidently it transforms to a triangle
‘half the size’. This means that the lengths of the sides become
half as long, and therefore that the area becomes
one-fourth the original area. If we say that a property
scales, we mean that it scales like a length to some power
d, and d is then called its scaling dimension. So a ‘volume’
has a scaling dimension three and a ‘point’ has scaling dimension
zero. This definition basically coincides with what
is called the topological dimension n of the (vector) space
ℝⁿ in which the object is naturally embedded.


So, in this section we address the interesting scaling properties
of certain geometric structures and constructions.


Scale invariance. If the object were to be the real line ℝ,
then the scale transformation x → x/2 would map the
line onto itself, and we therefore say that the line as a whole
is scale invariant. Similarly the spaces ℝⁿ are scale invariant.
So in that sense scale transformations are part of the
space-time symmetry similar to translations, rotations or 
Lorentz transformations. However the latter do not change 
the sizes of things, and therefore leave the space-time 
metric (which defines the notion of distance and therefore 
size) invariant. As scale transformations affect the size we 
expect the metric to change by some overall scale or con- 
formal factor. 


The logarithmic spiral. A spiral is a wonderful geometric
object that has found many stunning applications in nature
as an efficient format for growth. We show two examples
in Figures III.4.1 and III.4.2. We recommend reading the
beautiful chapter on ‘The equiangular spiral’ in the famous
book On Growth and Form by D’Arcy Wentworth Thompson,
first published in 1917. The ‘equiangular spiral’ is just
the logarithmic spiral depicted in Figure III.4.3, and it is
specified by giving the polar angle as a function of the radius:

\[ \theta(r) = \ln r . \qquad \text{(III.4.1)} \]

Under a scale transformation r → λr we find that θ →
θ′ = ln λr = ln r + ln λ: in other words we get the same
curve back but rotated over an angle ln λ. So we could
say that it is invariant under a combined scale transformation
and rotation over an angle of ln λ, or we could say
that it is strictly invariant under the discrete subset of scale
transformations with λ_n = exp 2πn.
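
Both statements are easy to check numerically. The sketch below tests whether a point lies on the spiral, given as the polar angle ln r up to multiples of 2π, and then applies a rescaling with and without the compensating rotation. The helper names on_spiral, spiral_point and scale_then_rotate are illustrative and not taken from the text.

```python
import math


def on_spiral(x, y, tol=1e-9):
    """True if the point (x, y) lies on the spiral theta(r) = ln r."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    mismatch = (theta - math.log(r)) % (2 * math.pi)  # angles only matter mod 2*pi
    return min(mismatch, 2 * math.pi - mismatch) < tol


def spiral_point(r):
    """Cartesian coordinates of the spiral point at radius r."""
    theta = math.log(r)
    return r * math.cos(theta), r * math.sin(theta)


def scale_then_rotate(point, lam, angle):
    """Scale a point by lam, then rotate it counterclockwise over `angle`."""
    x, y = lam * point[0], lam * point[1]
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y


p = spiral_point(1.5)
print("original point on spiral:         ", on_spiral(*p))
print("scaled by 2, no rotation:         ", on_spiral(*scale_then_rotate(p, 2.0, 0.0)))
print("scaled by 2, rotated over ln 2:   ", on_spiral(*scale_then_rotate(p, 2.0, math.log(2.0))))
print("scaled by exp(2*pi), no rotation: ", on_spiral(*scale_then_rotate(p, math.exp(2 * math.pi), 0.0)))
```

A generic rescaling moves the point off the curve, while the combined transformation and the discrete rescaling by exp 2π both map the curve onto itself.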


The Cantor set. The Cantor set can be constructed by
iterating a map that removes the middle third of an interval,
starting with the closed unit interval [0, 1]: in other words
C_1 : [0, 1] → [0, 1/3] ∪ [2/3, 1], and the unit interval is mapped to the
disjoint union of two smaller copies of itself. The first few
iterations of this map are illustrated in Figure III.4.4. If one
keeps iterating indefinitely one obtains a tree in which every
subtree is identical to a scaled version of the original tree,
and one says that this
set is self-similar. It is the prototype of a fractal, which is a
term that refers to its dimensionality.


The Hausdorff dimension. A fundamental property characteristic
of the scaling property of a fractal is its non-integer
Hausdorff dimension, which follows from the map
that defines the set. At each step the map generates a number
of copies, which we call m, each scaled down by a factor s.
For the Cantor set in the figure we have
m = 2 and s = 3. The Hausdorff dimension is defined
as d = ln m / ln s, and for the Cantor set we get the non-integer
value d = ln 2 / ln 3 = 0.631. It is a fractal indeed.
The definition recovers the integer topological dimensions
for a line, an area or a volume, as that would amount for
example to filling a square with four squares of half the
size, indeed yielding d = ln 4 / ln 2 = 2.
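
As a quick check of the formula d = ln m / ln s, the following Python sketch evaluates it for a few self-similar constructions. The function name similarity_dimension is an illustrative choice, and the line and cube entries are added here as sanity checks in the same spirit as the square.

```python
import math


def similarity_dimension(copies, shrink):
    """Dimension d = ln m / ln s for m copies, each scaled down by a factor s."""
    return math.log(copies) / math.log(shrink)


examples = {
    "line segment (2 halves)":        (2, 2),
    "square (4 half-size squares)":   (4, 2),
    "cube (8 half-size cubes)":       (8, 2),
    "Cantor set (2 copies, 1/3 size)": (2, 3),
}

for name, (m, s) in examples.items():
    print(f"{name}: d = {similarity_dimension(m, s):.3f}")
```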


Measure zero. The Cantor set itself is a curious mathematical
object: it is an infinite set of boundary points of (length)
measure zero. If we start with the unit interval of length 1,
then at each step we take out 1/3 of each subset. So the
length that is left over after n iterations is L_n = (2/3)^n,
which tells us that L_∞ = 0, showing that it is indeed a set
of measure zero.
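
The construction and the shrinking total length can be followed step by step with exact fractions. This is a minimal sketch; the function name remove_middle_thirds is hypothetical.

```python
from fractions import Fraction


def remove_middle_thirds(intervals):
    """Replace every closed interval [a, b] by its left and right thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))   # left third is kept
        result.append((b - third, b))   # right third is kept
    return result


intervals = [(Fraction(0), Fraction(1))]
for n in range(1, 6):
    intervals = remove_middle_thirds(intervals)
    total = sum(b - a for a, b in intervals)
    print(f"n = {n}: {len(intervals)} intervals, total length {total}")
```

The number of intervals doubles at every step while the total length is multiplied by 2/3, so the length tends to zero even though the number of end points keeps growing.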


Figure III.4.4: The Cantor set. The Cantor set as the result of
the infinite iteration of a map where the middle third of the interval
is removed, starting with the closed unit interval [0,1]. The
resulting set is the prototype of a fractal (string), clearly displaying
the property of self-similarity. (Source: Sam Derbyshire)


The Devil’s Staircase. Related to this set is Cantor’s function,
depicted in Figure III.4.5. It is a function that maps
the unit interval onto itself, but it is not one-to-one. The
function is constant on all regions of the interval that are
taken out by the infinite iterative process. This function is
also called ‘The Devil’s Staircase’ and satisfies an intriguing
functional equation:

\[ f(x) = 2 f\!\left(\frac{x}{3}\right), \qquad x \in [0,1], \qquad \text{(III.4.2)} \]

that fully captures its scaling behavior. The equation says
that if we first cut off the curve at x = 1/3 and scale it up horizontally
by a factor three, and after that vertically scale it
up by a factor two, we get the original function back. This
formula encapsulates its scale invariance property. An instructive
way to think about this function is to look at it as
the n → ∞ limit of an iterative approximation scheme defined by:

\[ f_n(x) = 2 f_{n-1}\!\left(\frac{x}{3}\right) \]

with initial condition f_0(x) = x. So indeed, this staircase
is devilish in that it has an infinite number of steps that in
some regions become extremely narrow.
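
To see the scaling relation f(x) = 2 f(x/3) at work numerically, one can approximate Cantor's function with the standard three-piece recursion, started from f_0(x) = x. Note that this piecewise form is an assumption made for the sketch: it is the usual textbook construction, written out more explicitly than the compact relation quoted above, and the function name cantor_function is illustrative.

```python
def cantor_function(x, depth=40):
    """Approximate Cantor's function on [0, 1] by `depth` steps of the
    standard three-piece recursion, starting from f_0(x) = x."""
    if depth == 0:
        return x
    if 3 * x <= 1:   # left third: half-height copy of the whole staircase
        return 0.5 * cantor_function(3 * x, depth - 1)
    if 3 * x >= 2:   # right third: same copy, shifted up by one half
        return 0.5 + 0.5 * cantor_function(3 * x - 2, depth - 1)
    return 0.5       # plateau over the removed middle third


for x in (0.1, 0.25, 0.5, 0.8, 1.0):
    lhs = cantor_function(x)
    rhs = 2 * cantor_function(x / 3)
    print(f"x = {x}: f(x) = {lhs:.6f}, 2 f(x/3) = {rhs:.6f}")
```

The two columns agree to within the accuracy of the approximation, which improves by a factor two with every extra level of recursion.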

Figure III.4.5: The Devil's staircase. An alternative way to rep- 
resent the Cantor set is as a function from the unit interval [0, 1] 
onto itself, by Cantor’s function, also known as the Devil's Stair- 
case. It is constant on the sub-intervals taken out, and has a 
constant slope in between. One can guess the scaling prop- 
erty of this function from looking at it: it satisfies the functional 
equation f(x) = 2f(x/3), which captures the self similarity of 
the function. 


The Sierpinski gasket. A slightly more complicated example
is the Sierpinski triangle or gasket of Figure III.4.6,
which is obtained by iterating a discrete map of a shape into
a scaled version of itself. It generates an object which
is self-similar by construction. And if we iterate the map- 
ping indefinitely we would end up with a fractal space that 
would be invariant under a specific set of discrete scale 
transformations. 


The Hausdorff dimension involves again a length down-scaling
factor s, which for the Sierpinski triangle equals
s = 2, and a multiplication factor m = 3, as is clear from
the figure. Therefore the gasket has the fractal dimension:
d = ln 3 / ln 2 = 1.58.


Figure III.4.6: Sierpinski gasket. This geometrical structure
has fractal properties. It is a self-similar structure. If we take
the number of scaling steps to infinity it becomes fractal. If we
scale the dimensions by a factor 2, then the length of the yellow
curve does not increase by a factor 2 but by a factor 3. This
means that the scaling dimension of the gasket would be d =
ln 3 / ln 2 = 1.58.


In the figure we have also drawn a yellow fractal curve and
we may apply the same argument, and because for a line
segment we have again s = 2 and m = 3 we find the
same value for the fractal dimension, d = 1.58, validating
our intuition that the dimension of the gasket is more than
one and less than two. We may also look at the measure of
the objects: the area covered by the purple triangles after
k iterations equals A_k = (3/4)^k A_0, which means that the
limiting area would be A_∞ = 0, so we find again a set of
measure zero. The length of the fractal curve would tend
to infinity and its measure is unbounded.
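
The two opposite limits can be tabulated directly. In this sketch the area follows the factor 3/4 per iteration quoted above, while the factor 3/2 per iteration for the curve is inferred here from m = 3 copies at half the size; the function names are illustrative.

```python
from fractions import Fraction


def gasket_area(k, initial_area=Fraction(1)):
    """Area left after k iterations: each step keeps 3 of the 4 sub-triangles."""
    return Fraction(3, 4) ** k * initial_area


def curve_length(k, initial_length=Fraction(1)):
    """Length of the fractal curve after k iterations: 3 copies at half the size."""
    return Fraction(3, 2) ** k * initial_length


for k in (0, 1, 2, 5, 10, 20):
    print(f"k = {k:2d}: area = {float(gasket_area(k)):.6f}, length = {float(curve_length(k)):.2f}")
```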


The disc where Escher and Poincaré met 


In Figure III.4.7 we depicted a sequence of images that
interpolate smoothly between the original Escher art work
Circle Limit II and its underlying hyperbolic geometry of the
disc. This hyperbolic tessellation (or tiling) is composed of
circular segments that intersect the unit circle orthogonally.
Starting with the hyperbolic square at the center one obtains
the subsequent segments or vertices by mirroring (inverting)
the points in the various circular segments as we
indicated in Figure III.4.9. The radial tree connecting the
nodes is very similar to the binary tree used to construct
the Cantor set as displayed in Figure III.4.4.


Figure III.4.7: The hidden geometry of Escher.


Figure III.4.8: Poincaré disc. In the figure we show how to get
from the hyperbolic plane, in orange, with geodesics that are
hyperbolas like the yellow one, to the disc. The geodesics are
obtained by intersecting the surface with a plane through the
origin like the green one. The disc is obtained by stereographically
projecting the hyperbolic plane down onto the unit disc in the
z = 0 plane, towards the point P = (0,0,−1).


The hyperbolic plane. For the hyperbolic plane we may
choose the positive z > 0 sheet satisfying the equation
x² + y² − z² = −1. It is the orange surface in Figure III.4.8.
This hyperbolic plane is not so unfamiliar as you might
have thought; it is identical to the surface defined by the relativistic
energy-momentum vectors p_μ for a particle with unit
rest-mass living in a flat two-plus-one-dimensional Minkowski
space-time, which we discussed in Chapter I.1. You
can also view it as the Minkowskian analogue of the unit
sphere in three Euclidean dimensions (or rather the Northern
hemisphere thereof), which is obtained if one switches the
sign in front of the z² term. The geodesics on the hyperbolic
plane correspond to any intersection of the surface
with a plane through the origin, like the green plane in the
figure yielding the yellow hyperbola. These hyperbolas are
geodesics, to be compared with straight lines on the plane
or the great circles on an ordinary spherical surface.


The Poincaré disc. The disc geometry that Escher ex- 
ploited corresponds to the so-called Poincaré disc, which 
is the stereographic projection of the hyperbolic plane on 
the unit disc in the flat z=0 plane (light grey in the fig- 
ure) from the point P = (0,0,—1). For a given hyperbola 
one gets a line bundle like the purple surface in the figure,
yielding a circular segment that approaches the circle
bounding the disc orthogonally as indicated in Figure
III.4.7. This bounding circle represents the circle at infinity
on the hyperbolic plane. These segments accumulate to- 
wards the boundary circle which represents a critical point, 
or a limit like we described in the previous examples. A 
wonderful non-Euclidean construction indeed. 
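
The projection itself is a one-line formula. In the sketch below the upper sheet is parametrised as (sinh ρ cos φ, sinh ρ sin φ, cosh ρ), which lies on the unit mass shell z² − x² − y² = 1; this parametrisation and the function names are assumptions made for the illustration. The line from P = (0,0,−1) through a point (x, y, z) of the sheet meets the z = 0 plane at (x/(1+z), y/(1+z)).

```python
import math


def hyperboloid_point(rho, phi):
    """Point on the z > 0 sheet, with z**2 - x**2 - y**2 = 1."""
    return (math.sinh(rho) * math.cos(phi),
            math.sinh(rho) * math.sin(phi),
            math.cosh(rho))


def to_poincare_disc(x, y, z):
    """Stereographic projection from P = (0, 0, -1) onto the plane z = 0."""
    return x / (1 + z), y / (1 + z)


for rho in (0.0, 0.5, 2.0, 5.0):
    u, v = to_poincare_disc(*hyperboloid_point(rho, 0.3))
    print(f"rho = {rho}: distance from disc center = {math.hypot(u, v):.6f}")
```

The image radius equals tanh(ρ/2), so it stays below one for every finite ρ: the whole infinite sheet fits inside the unit disc, and points far out on the sheet pile up near the bounding circle.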


The Escher tilings. The fractal geometry of the hyperbolic
tessellation of the disc clearly exhibits how the basic ‘amphibian’
gets rescaled and rotated if one approaches the
boundary, and indeed the number of them tends to infinity
near the boundary. The different hyperbolic tilings can
be denoted by a pair of integers {n, k}, called a Schläfli
pair, where n is the number of edges of the basic polygon
(n = 4 in this case), and k is the number of edges
that meet at a vertex (k = 6) under equal angles, equaling
360/k degrees. Clearly the n angles of the polygon add
up to 360n/k degrees, and if this sum is less than the
Euclidean value of 180(n − 2) degrees (360° for n = 4),
then we are dealing with a regular tiling of the hyperbolic
plane. Note that in Chapter III.2 in the section on crystal
structures we discussed the tilings by regular polygons of
the plane, where the two sums are equal, a condition that
can only be satisfied for k = 3, 4, and 6.
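
The angle-sum criterion can be turned into a small classifier. This is a sketch under the standard assumption that the corner angles of a flat n-gon add up to 180(n − 2) degrees; the function name classify_tiling is hypothetical. Integer arithmetic is used so that the borderline (flat) case is detected exactly.

```python
def classify_tiling(n, k):
    """Classify the regular tiling {n, k}: n-gons, with k of them meeting at each vertex.

    Compares the angle sum 360*n/k with the flat value 180*(n - 2),
    after multiplying both sides by k/180 to stay within the integers.
    """
    tiling_sum = 2 * n        # 360*n/k   times k/180
    flat_sum = (n - 2) * k    # 180*(n-2) times k/180
    if tiling_sum < flat_sum:
        return "hyperbolic"
    if tiling_sum == flat_sum:
        return "euclidean"
    return "spherical"


print("{4, 6}:", classify_tiling(4, 6))

flat = [(n, k) for n in range(3, 13) for k in range(3, 13)
        if classify_tiling(n, k) == "euclidean"]
print("flat tilings {n, k}:", flat)
```

The only flat cases found are the triangular, square and hexagonal tilings, in line with the values k = 6, 4 and 3 mentioned above; every pair with more polygons around a vertex is hyperbolic.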


Figure III.4.9: Inversion map. This map defines for any point
(r, θ) outside the unit circle bounding the disc a mirror point
(1/r, θ). If a circle crosses the disc, points on the inner and
outer segments connected by a radial line through the center
are mapped onto each other. The Escher disc combines this
mirroring in ever smaller circles with mirroring in a symmetry
axis through the center of the disc, as is indicated in the last
picture of the previous figure.


Maurits Escher himself was not a mathematician, but his
work, not surprisingly, attracted much attention from mathematicians.
This started at the International Congress of
Mathematicians in Amsterdam in 1954, where one of the
organisers, N.G. de Bruijn, had arranged for an exhibition
of Escher’s work in the Stedelijk Museum. In particular
the British mathematician H.S.M. Coxeter had many exchanges
with Escher on the mathematical meaning and
interpretations of his work. It is clear that the interactions
fascinated and inspired Escher, but it is also clear that he
kept doing the mathematics in ‘his own way’:


My great enthusiasm for this sort of picture and my
tenacity in pursuing the study will perhaps lead to
a satisfactory solution in the end. ... it seems to
be very difficult for Coxeter to write intelligibly to a
layman. Finally, no matter how difficult it is, I feel
all the more satisfaction from solving a problem like
this in my own bumbling fashion.

Escher in a letter to his son George


For the mathematics of Escher’s work I refer to the book edited by
H. S. M. Coxeter, M. Emmer, R. Penrose and M. L. Teuber (M.C. Escher:
Art and Science) and an article by Doris Schattschneider (Notices of
the AMS, Volume 57, Number 6, 2010).


Escher used the term coxeteering for his incredibly imaginative
and creative explorations of the hyperbolic disc and
its tessellations in a series of prints he called Circle Limits.


Figure III.4.10: Logistic map. This iterative map defines a discrete
dynamical system on the unit interval (0 < x < 1) and is
given by x_{n+1} = f(x_n) = r x_n (1 − x_n).


Scaling in dynamical systems 


The systems we have been looking at so far have been
completely geometric, where the scaling patterns were quite
obvious from the start, but now we want to explore the domain
of dynamical systems where scaling behavior can be
more hidden but highly non-trivial. We start with the logistic
map, which is a (discrete) dynamical system that
exhibits scaling behavior in its parameter space {r}.


k     cycle (2^k)           r_k

1     2                     3
2     4                     3.449490
3     8                     3.544090
4     16                    3.564407
5     32                    3.568750
6     64                    3.56969
7     128                   3.56989
8     256                   3.569934
9     512                   3.569943
10    1024                  3.5699451
11    2048                  3.569945557
∞     accumulation point    3.569945672


Table III.4.1: The bifurcation sequence.


The logistic map 


The logistic map is a canonical example of a system which
displays what is called deterministic chaos. It is an iterative
map of the unit interval (0 < x < 1) onto itself, where each
iteration corresponds to a time step. The map is quadratic
and given by

$$ x_{n+1} = f(x_n) = r\, x_n (1 - x_n) \qquad (n = 1, 2, 3, \ldots). $$   (III.4.3)

It is plotted in Figure III.4.10 for three different values of the
parameter r. This is one of the best-studied equations
in mathematical physics, with a vast literature dedicated
to its remarkable properties.
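
To make the iteration concrete, here is a minimal Python sketch of the map in the form $x_{n+1} = r x_n(1 - x_n)$. The helper names (`logistic`, `attractor`) and the choices of transient length, sample size and rounding are illustrative assumptions, not taken from the text.

```python
def logistic(x, r):
    """One step of the logistic map x -> r x (1 - x)."""
    return r * x * (1.0 - x)


def attractor(r, x0=0.2, transient=2000, sample=64, digits=6):
    """Distinct values the orbit visits after a transient has died out.

    The transient length, sample size and rounding are arbitrary
    illustrative choices; they suffice for short limit cycles.
    """
    x = x0
    for _ in range(transient):
        x = logistic(x, r)
    seen = set()
    for _ in range(sample):
        x = logistic(x, r)
        seen.add(round(x, digits))
    return sorted(seen)


# A fixed point, a 2-cycle and a 4-cycle, respectively.
for r in (2.8, 3.2, 3.5):
    print(r, attractor(r))
```

For r = 2.8 the orbit settles on the single fixed point x = 1 - 1/r; for the two larger values it alternates between two and four values.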


In Figure III.4.11 we have in the left column depicted the
orbits corresponding to the first fifty iterations of the map
with initial value $x_0 = 0.2$, for three values of r. What we
see is that with increasing values of r the behavior of the
orbit for $n \gg 1$ changes drastically.


For small r it starts with a fixed point, then we get into a
region where the orbit becomes a 2-cycle, after which one
obtains ever smaller regions where the period doubles to
some $2^k$-cycle. In the second column the same orbits
are represented as a cobweb diagram where the successive
steps are obtained by mirroring the outcome of the
n-th iteration in the line y = x to obtain the input for the
(n + 1)-th iteration. In these diagrams the limit cycle
behavior is very clear. In the right column we have depicted
the so-called bifurcation diagram, which shows what the
cycles are as a function of r and at what values the period
doubling occurs. For increasing r the points $r_k$, where the
period doubles and the $2^k$-cycle starts, accumulate
at some critical point $r_\infty = 3.56995\ldots$, where a transition
to chaotic behavior occurs.


The bifurcation diagram of Figure III.4.12 suggests that there
is some form of self-similarity present in this system, and
it was Mitchell J. Feigenbaum who in 1978 extracted two
fundamental constants from the system that characterize
the scale invariance of the system near the critical point
$r_\infty$.


The first Feigenbaum constant is given by the limiting
behavior of the following sequence (see figure):

$$ \delta_k = \frac{r_k - r_{k-1}}{r_{k+1} - r_k}, \qquad \delta = \lim_{k\to\infty} \delta_k = \lim_{k\to\infty} \frac{d_k}{d_{k+1}} = 4.6692\ldots $$   (III.4.4)

This number $\delta$ is universal in that it does not depend on
the details of the map as long as it has quadratic behavior
near the maximum and vanishes at the endpoints of
the interval, and it turns out that this constant governs the
asymptotic behavior of all period doubling sequences. One
might rephrase the above equation by saying that for large
$k \gg 1$ the interval $d_k = r_\infty - r_k$ converges like a geometric
series $d_k \sim C\,\delta^{-k}$.
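
As a rough numerical check, the ratios $\delta_k$ can be computed directly from the first entries of Table III.4.1. This is only a sketch: the values are copied as printed in the table, and because they are rounded the ratios scatter around the limiting value rather than converging cleanly.

```python
# Period-doubling points r_1 ... r_6, as printed in Table III.4.1.
r = [3.0, 3.449490, 3.544090, 3.564407, 3.568750, 3.56969]

# delta_k = (r_k - r_{k-1}) / (r_{k+1} - r_k) for the interior entries.
ratios = [
    (r[k] - r[k - 1]) / (r[k + 1] - r[k])
    for k in range(1, len(r) - 1)
]

for k, q in enumerate(ratios, start=2):
    print(f"delta_{k} = {q:.3f}")
```

With more digits in the tabulated $r_k$, the later ratios would settle closer to the Feigenbaum constant; with the rounded entries used here they stay within a few percent of it.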
[Figure panels: Orbit, Cobweb, Bifurcation plot]


Figure III.4.11: Logistic map orbits. We show orbits starting at x = 0.2 for three different r values (r = 2.8, 3.2 and 3.8) in
the first column. In the second column the same orbits are given as 'cobwebs.' In the final column we marked the corresponding r
values in the bifurcation diagram.
Figure III.4.12: Bifurcation diagram. This diagram gives the x
values in the subsequent $2^k$ limit cycles as a function of the r
parameter of the logistic map. The limiting behavior does not
depend on the initial value $x_0$, and forms therefore a global
attractor. Starting at small r the sequence of points $r_k$ where
the doubling to a $2^k$-cycle starts accumulates at some point
$r_\infty = 3.56995\ldots$, after which a highly unpredictable limiting
behavior sets in, which is called deterministic chaos.


There is a second universal constant that can be extracted
from the diagram. It is determined by the limiting behavior
of the sequence of separations $s_k$, where $s_k$ is the
separation in x between the two adjacent central values
of the $2^k$-cycle at $r = r_{k+1}$, as we have indicated in the
figure. For large k one finds that $s_{k+1} = s_k/\alpha$ where
$\alpha = 2.5029\ldots$


The essential scaling property of the limiting behavior of
the period doubling sequence is expressed by a scaling
function g(x), which would be the solution of a functional³
equation analogous to equation (III.4.2) for the devil's
staircase. The equation for the period doubling sequence is
called the Feigenbaum-Cvitanović equation:

$$ g(x) = \alpha\, g(g(x/\alpha)) , $$   (III.4.5)

with boundary condition g(0) = 1. There is a unique solution
to this equation that fixes both the value of $\alpha$ and the
function g(x), which you should think of as specifying the
attractor at the accumulation point (the set of $2^k$ points in
the limit $k \to \infty$). The F-C function is plotted in Figure
III.4.13, and could have been called the 'devil's castle'
because the battlements contain ever smaller self-similar
versions of the castle. A stunning architectural masterpiece,
obviously. A remarkable property of this equation,
and thus of its solution, is that it is independent of the precise
form of the logistic map f, and it is in that sense that the
parameter $\alpha$ is universal over the class of functions {f}.


Figure III.4.13: Feigenbaum-Cvitanović function. The F-C
function can be compared with Cantor's staircase function. It
captures the strange attractor of the logistic map. The function
satisfies the F-C functional equation $g(x) = \alpha\, g(g(x/\alpha))$.


³ A function f(x) is a mapping from a space X of the variable to some
space of function values, like the real line R or the complex plane C.
Formally a functional is a 'function of a function' and corresponds to a
map from a space or a certain class of functions F to a space of values
like R or C.
Scaling in quantum theory


Quantum mechanics 


In earlier chapters we have argued that (continuous) space- 
time symmetries lead to conserved quantities, and that 
in quantum theory these conserved quantities are repre- 
sented by certain operator expressions that act on the Hil- 
bert space. These operators are expressed as functions 
of the basic degrees of freedom. So, in the quantum me- 
chanics of a particle the basic operators are X and P, cor- 
responding to the classical phase space coordinates x and 
p. And from these one can construct the operators for 
other dynamical variables like the energy or angular mo- 
mentum. The operators work on the Hilbert space of wave 
functions. In quantum field theory the basic operators are 
the fields themselves and their conjugate momentum fields 
and these work on the multi-particle Hilbert space. 


Operators that represent space-time symmetries. In
a quantum system symmetry operators commute with the
Hamiltonian, and therefore transform states that have the
same energy among each other: in other words, states
that are degenerate. We recall that for the case of the
hydrogen atom, the energy levels are labeled by the principal
quantum number n, and for any n we have an $n^2$-fold
degenerate set of states. This set consists of representations of
the rotation group SO(3) labeled by the angular momentum
eigenvalues l, with l = 0, ..., n − 1. At a given
energy level n the total degeneracy can be understood if one
adds the Runge-Lenz vector, to be thought of as a vector
of symmetry operators, to the symmetry algebra. This is a
dynamical symmetry which follows from the particular form
of the Coulomb (or Newton) potential and is not related to
an underlying space-time symmetry. Inclusion of this vector
extends the symmetry algebra from so(3) to so(4), as
we discussed in connection with Figure II.6.3 in Chapter
II.6.
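
The counting behind the $n^2$ degeneracy can be made explicit. As a short worked equation, using the standard fact (assumed here, not restated in this section) that the angular momentum multiplet labeled by l contains 2l + 1 states:

```latex
\sum_{l=0}^{n-1} (2l+1) \;=\; 2\cdot\frac{(n-1)\,n}{2} + n \;=\; n^2 ,
\qquad \text{e.g. } n = 3:\quad 1 + 3 + 5 = 9 .
```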


Let us now turn to the expression for the operator A that 
generates scale transformations on a one-particle Hilbert 
space. We do so after we have recalled how it worked for 
the case of translations. 


The case of translations generated by momentum. In
previous chapters we discussed how in quantum theory
the momentum operator P acting on a wave function is
represented as the Hermitian differential operator
$P = -i\,d/dx$ (with $\hbar = 1$). This operator generates 'translations',
meaning that if we act with a finite transformation
on any function

$$ T(a) f(x) = e^{iaP} f(x) = f(x + a), $$

then the argument of the function is shifted by an amount
a. The momentum operator has a continuous set of
eigenfunctions $f_k(x) \sim e^{ikx}$ because:

$$ P f_k(x) = k f_k(x) . $$

These functions are periodic and the expansion of an
arbitrary function in this basis of eigenfunctions amounts to
a Fourier decomposition of that function. Needless to say,
the only translation invariant function is the constant
function, corresponding to k = 0. Finally we recall that
translational invariance of a system implies that the
momentum operator commutes with the Hamiltonian,
and hence that momentum is conserved.
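
The statement that $e^{iaP} = e^{a\,d/dx}$ shifts the argument can be checked on polynomials, for which the exponential series terminates. The following Python sketch does this with plain coefficient lists; the function names and the test polynomial are illustrative assumptions.

```python
from math import factorial


def derivative(coeffs):
    """Coefficients of p'(x) for p(x) = sum_m coeffs[m] * x**m."""
    return [m * c for m, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**m for m, c in enumerate(coeffs))


def translate(coeffs, a):
    """Apply exp(a d/dx) = sum_n a**n / n! (d/dx)**n to a polynomial.

    Each derivative lowers the degree by one, so the series stops
    after finitely many terms.
    """
    result = [0.0] * len(coeffs)
    term = list(coeffs)
    n = 0
    while term:
        for m, c in enumerate(term):
            result[m] += a**n / factorial(n) * c
        term = derivative(term)
        n += 1
    return result


p = [1.0, -2.0, 0.0, 3.0]          # p(x) = 1 - 2x + 3x^3
a, x = 0.5, 1.3
print(evaluate(translate(p, a), x), evaluate(p, x + a))
```

Both printed numbers agree up to rounding, which is the Taylor-series content of $T(a) f(x) = f(x + a)$.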


The scaling operator A and its eigenfunctions. Now we
ask the same questions about scale invariance: what is the
operator representing scale transformations on functions,
what are its eigenfunctions, and finally, what does it
mean to say that a system is scale invariant? The scale
operator is $A = x\,\frac{d}{dx}$ and its eigenfunctions are quite
simple to derive:

$$ x \frac{d}{dx}\, g_d(x) = d\, g_d(x) \;\Rightarrow\; \ln g_d = d \ln x = \ln x^d . $$   (III.4.6)
So again there is a continuum of eigenfunctions which are
just powers of x: $g_d(x) \sim x^d$ for any d. The eigenvalue
d is called the scaling dimension. Under a finite scaling
transformation $S(\sigma)$ we would get:

$$ S(\sigma)\, g_d(x) = e^{\sigma\, x \frac{d}{dx}}\, g_d(x) = e^{\sigma d}\, g_d(x). $$

This expression gains transparency and elegance if we
take the parameter logarithmic:

$$ S(\ln\lambda)\, g_d(x) = e^{d \ln\lambda}\, g_d(x) = \lambda^d g_d(x) = g_d(\lambda x). $$


Power laws. This gives an alternative way to define a scaling
function in general; it is any function h(x) that satisfies
the scaling law:

$$ h(\lambda x) = \lambda^d h(x), $$   (III.4.7)

for any $\lambda$, where the power d is defined as the scaling
dimension of the function. Indeed the scaling functions are
the eigenfunctions of the scaling operator and are just single
powers of their argument. A scale invariant function is
the eigenfunction with d = 0, again meaning any constant
function.


We just saw that making the scale transformation $S(\ln\lambda)$
on an eigenfunction effectively multiplies the argument of
that function by $\lambda$. This is a special property in the sense
that it multiplies the argument and not the function. Thus,
if I apply the operator to an arbitrary linear combination of
eigenfunctions, I get exactly the same combination back
with scaled argument. In other words, if we think of an
arbitrary function that can formally be expanded in a power
series, then what the scale transformation S does is just
to scale the argument of that arbitrary function. This is to
be expected because it is the defining property of a scaling
transformation on any function, but it does not imply
that any arbitrary function is a scaling function, as it will in
general not satisfy the scaling property (III.4.7).
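
The distinction between "scaling the argument" and "being a scaling function" can be illustrated on polynomials. Since $x\,\frac{d}{dx}$ multiplies $x^m$ by m, the finite transformation with logarithmic parameter multiplies the coefficient of $x^m$ by $\lambda^m$. The sketch below is a hedged illustration with hypothetical helper names (`scale`, `scaling_dimension`) and an arbitrary test polynomial.

```python
from math import exp, log


def evaluate(coeffs, x):
    """Evaluate p(x) = sum_m coeffs[m] * x**m."""
    return sum(c * x**m for m, c in enumerate(coeffs))


def scale(coeffs, lam):
    """Apply exp(ln(lam) * x d/dx) to a polynomial.

    x d/dx has eigenvalue m on x**m, so the coefficient of x**m
    picks up a factor exp(m * ln(lam)) = lam**m.
    """
    sigma = log(lam)
    return [exp(sigma * m) * c for m, c in enumerate(coeffs)]


def scaling_dimension(coeffs):
    """Return d if the polynomial is a single power x**d, else None."""
    powers = [m for m, c in enumerate(coeffs) if c != 0]
    return powers[0] if len(powers) == 1 else None


p = [1.0, -2.0, 0.0, 3.0]          # 1 - 2x + 3x^3: not a scaling function
lam, x = 2.5, 0.7
print(evaluate(scale(p, lam), x), evaluate(p, lam * x))   # argument is scaled
print(scaling_dimension(p), scaling_dimension([0.0, 0.0, 0.0, 3.0]))
```

The transformation scales the argument of any polynomial, but only the single power satisfies the scaling law with a definite dimension.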


The symmetry algebra including scaling. To further discuss
scaling properties it is useful to study the commutation
relations of the scale operator with other elementary operators
forming the dynamical Lie algebra. For example:

$$ [A, X] = i\,[XP, X] = i\,(XPX - XXP) = i\,X[P, X] = X $$
$$ [A, P] = i\,[XP, P] = i\,(XPP - PXP) = i\,[X, P]P = -P $$   (III.4.8)

The commutator gives the operator back multiplied by its naive scaling
dimension, which is the dimension of the operator in units
of length. Note that the angular momentum operator has
scaling dimension zero as it involves products of X and P
components; this is also consistent with its quantization in
integer multiples of $\hbar$, which at this point is dimensionless
as it has units $\mathrm{J\,s} \sim \ell^0$.
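
The vanishing scaling dimension of the angular momentum follows in one line from the two commutators above. As a worked equation, writing $L_k = \epsilon_{ijk} X_i P_j$ and using the product rule for commutators (the component notation is an assumption introduced here for illustration):

```latex
[A, X_i P_j] \;=\; [A, X_i]\,P_j + X_i\,[A, P_j]
            \;=\; X_i P_j - X_i P_j \;=\; 0
\quad\Longrightarrow\quad [A, L_k] = 0 .
```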


The calculation we just did shows that we can extend the
combined Lorentz and translation symmetry, denoted as
the Poincaré group, with the scale transformations. Including
the scale transformations we also need to include the
so-called inversion operator I with $I: x \to x/x^2$. Adding
these two operators to the dynamical operator algebra,
one ends up with a closed Lie algebra with fifteen generators,
which is referred to as the conformal algebra, which for
four-dimensional Minkowski space is the algebra so(4, 2).
This algebra corresponds (is isomorphic) to the 'rotations'
in a six-dimensional 'space' with four space and two time
dimensions.


So far we have mainly discussed mathematical features of 
scaling functions and operators. Let us now return to the 
physics of scale invariance. We do this at various levels 
of increasing complexity starting with simple classical sys- 
tems and moving up to applications of scaling in quantum 
(field) theory. 


Scaling properties of some Hamiltonians. Having the 
scale operator it is interesting to see what one can learn 
about the scaling properties of some Hamiltonians and 
other operators. 




To keep it simple we look at a particle with Hamiltonian
H = U + V or Lagrangian L = U − V, where the kinetic
term is $U = P^2/2M$ and for the potential we choose a simple
power, $V = a_k x^k$. Now for consistency we must have
that $[H] = [U] = [V] = \ell^{-1}$. This implies that indeed
$[U] = [M]^{-1}\cdot[P]^2 = \ell^{-1}$, as expected. For the potential
term we find that $[V] = [a_k]\cdot\ell^k = \ell^{-1}$, from which we
conclude that $[a_k] = \ell^{-1-k}$, so this simple power counting
yields the dimensionality of the parameters or coupling
constants.


We see that the kinetic and potential terms will in general
scale differently under scale transformations of the
coordinates. Just transforming coordinates ($x \to \lambda x$, and hence
$P \to P/\lambda$) and keeping the parameters fixed we get that:

$$ H \;\to\; \frac{1}{\lambda} H(\lambda) = \frac{1}{\lambda}\left[ \frac{P^2}{2M(\lambda)} + a_k(\lambda)\, x^k \right] . $$

This expression leads us to conclude that under a rescaling
of the coordinates the Hamiltonian is mapped into a
similar Hamiltonian $H(\lambda)$, with different, scale dependent,
parameters: $M' = M(\lambda) = \lambda M$ and
$a_k' = a_k(\lambda) = \lambda^{k+1} a_k$.


Let us look at some simple cases: 


1. The harmonic oscillator.

The potential is given by $V(x) = \frac{1}{2} K x^2$, and corresponds
to the case k = 2. The spectrum is depicted in Figure
II.5.14 on page 396 of Part II. It is equally spaced, with
energy levels $E_n = \omega(n + \frac{1}{2})$ where the frequency $\omega$ is given
by $\omega = \sqrt{K/M}$. The frequency is the only physically
relevant parameter and we see that its scale dependence is
$\omega(\lambda) = \sqrt{K(\lambda)/M(\lambda)} = \lambda\omega$. The spectrum apparently
scales linearly with $\lambda$.


The concept that we want to emphasize is the fact that under
scaling the theory changes. If we define the theory as
a point in the space of parameters, then under rescaling
the theory will trace out a trajectory in that space. In the
example at hand, the parameter space is a plane with
coordinates M and K. With $M(\lambda) = \lambda M$ and $K(\lambda) = \lambda^3 K$,
we see that we can eliminate the $\lambda$, to obtain a function
$K(\lambda) = (K/M^3)\,M(\lambda)^3$. We have depicted one such trajectory
for a particular initial condition K(1) = K and M(1) =
M in Figure III.4.14.


Figure III.4.14: Scaling trajectory of harmonic oscillator. Scaling
the coordinate by a factor $\lambda$ in the harmonic oscillator
Hamiltonian is equivalent to a trajectory of the parameters $M(\lambda)$ and
$K(\lambda)$ through parameter space.
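
The trajectory is simple enough to check numerically. The sketch below assumes the scaling rules $M(\lambda) = \lambda M$ and $K(\lambda) = \lambda^3 K$ quoted above; the starting values M = 2 and K = 5 are arbitrary illustrative numbers.

```python
from math import isclose, sqrt


def scaled_parameters(M, K, lam):
    """Parameters of the rescaled oscillator: M -> lam*M, K -> lam**3 * K."""
    return lam * M, lam**3 * K


def frequency(M, K):
    """Oscillator frequency omega = sqrt(K / M)."""
    return sqrt(K / M)


M, K = 2.0, 5.0                     # arbitrary starting point (lam = 1)
for lam in (0.5, 1.0, 3.0):
    M_l, K_l = scaled_parameters(M, K, lam)
    on_curve = isclose(K_l, (K / M**3) * M_l**3)      # K = (K/M^3) M^3
    linear = isclose(frequency(M_l, K_l), lam * frequency(M, K))
    print(lam, M_l, K_l, on_curve, linear)
```

Every scaled point lies on the cubic curve through the starting point, and the frequency, and with it the level spacing, scales linearly with the scale factor.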


What we learn from this graph is not earth-shattering, just
that for large values of $\lambda$, the potential term starts to
dominate so that the system will get locked into the ground
state. On the other hand if $\lambda \to 0$ the kinetic term
dominates and the Hamiltonian approaches that of a free particle,
where the energy gap tends to zero. So at short distances
the theory has a fixed point where the theory is
free, a primitive precursor of the notion of what is called
'asymptotic freedom'. This is not so surprising, because
it is what we could have concluded directly from the linear
$\lambda$ dependence of $\omega(\lambda)$, which implies that the energy gap
tends to zero.
2. The hydrogen atom.


The part of the Hamiltonian that is of interest here is the
radial part because when we scale the coordinates we
rescale the r variable and not the angular variables $\theta$ and
$\varphi$. This reflects the fact that the angular derivatives of the
Hamiltonian are all contained in the term $L^2/2Mr^2$,
where the angular momentum operator is $L = X \times P$. In
view of the relations (III.4.8) the scaling dimension of L is
d = 0, and thus, as stated before, the angular momentum
is scale invariant. So, we are left with the 'radial' Hamiltonian,
which is very similar to the one given in equation (I.4.1)
we discussed in Chapter I.4; it takes the form:

$$ H = \frac{p_r^2}{2M} + \frac{l(l+1)}{2Mr^2} - \frac{e^2}{4\pi r} , $$

where l is the angular momentum label, and l(l + 1) is
the eigenvalue of the operator $L^2$. Doing the scaling
exercise as before we find that $M(\lambda) = \lambda M$ and, interestingly,
that the charge does not rescale: $e(\lambda) = e$. Let us
look at what that implies for the spectrum in this case. The
discrete bound state energy levels are labeled by the principal
quantum number n, and are given by:

$$ E_n = \frac{E_1}{n^2} \qquad \text{with} \qquad E_1 = -\frac{M}{2}\left(\frac{e^2}{4\pi}\right)^2 . $$

We conclude that the levels simply scale like $E_n(\lambda) = \lambda E_n$,
confirming our naive expectations.


On the one-particle level the quantum analysis of scal- 
ing properties does not lead to surprising new insights. 
It merely confirms the behavior you would expect on the 
basis of naive dimensional analysis. As we will see in the 
remaining sections of this chapter it is in quantum field the- 
ory that interesting complications arise. 


Quantum field theory 


In this subsection we turn to the question what scaling 
means in quantum field theory. We will look at this problem 


from a rather general and abstract point of view, avoid- 
ing as many technicalities as possible. In later sections 
we give more details about how these results can be ob- 
tained. 


The fundamental question is again to understand how pa- 
rameters of the model change depending on the scale at 
which one looks at the system. And as the quantum uncer- 
tainty relations imply an inverse relation between spatial 
scale (wavelength) and momentum or energy, we expect 
to learn something about the energy dependence of the 
phenomena the theory describes. By exploiting arguments 
like the ones we used in the previous subsection we may 
even probe the domain of validity of certain theories. 


Actions and Lagrangians. In general a theory can be
defined by its energy function or Hamiltonian H, or its action
S. As mentioned before, in relativistic systems and field
theories, one prefers the action because it is a manifestly
Lorentz invariant quantity, while the energy is not, as it is a
component of the energy-momentum four vector.


The action can be written as a functional of the field, a
space-time integral over a Lagrange density $\mathcal{L}$, which is an
expression in the fields and their derivatives. We write:

$$ S = \int \mathcal{L}\, d^4x , $$   (III.4.9)

and in units where $\hbar = c = 1$ the action is a dimensionless
quantity. At this point the difference between the quantum
and classical expression resides completely in the
interpretation of the fields. Classical fields are just scalar,
or vector, or spinor valued functions on coordinate space.
Quantum fields are very different types of objects: they
are operator valued and work on some multi-particle Hilbert
space as we discussed in Chapters I.4 and II.5.


Three examples. In the remaining sections of this chapter 
we will refer to the three different examples of Lagrangian 
densities we introduce next. 
- The $\phi^4$ model. The first action is about the simplest
non-trivial field theory one can think of and it owes its popularity
exactly to the fact that it is often used to demonstrate
the intricacies of quantum field theory. It is a theory of a
real scalar field $\phi(x^\mu)$ with a quartic self-interaction. The
action of this so-called '$\phi$-fourth theory' is defined by the
relativistic Lagrangian density $\mathcal{L}$:

$$ \mathcal{L}(\phi, \partial_\mu\phi) = \tfrac{1}{2}(\partial_\mu\phi)^2 + \tfrac{1}{2} m^2\phi^2 + \lambda_4\phi^4 . $$   (III.4.10)

The classical field $\phi$ is just an arbitrary function which may
be expanded in an orthonormal set of basis functions, for
example energy momentum eigenstates or plane waves
{b(k)}:

$$ \phi(x) \sim \int b(k)\, e^{i k\cdot x}\, dk . $$

In Chapter II.5 we pointed out that in quantum field theory
the fields are operators acting on a multi-particle Hilbert
space and can create or annihilate particles in any given
energy momentum state labeled by $k^\mu$, with $k^2 = m^2$ (m
= rest mass). The first two terms of the Lagrangian are often
denoted as $\mathcal{L}_0$, and being quadratic in the fields, they
make up the free field theory. The last term, denoted by
$\mathcal{L}_{int}$, describes the self-interactions of the field with
coupling strength $\lambda_4$.


- The toy model. Of course a field theory can be defined in
any number of space-time dimensions, and formally nothing
forbids us, for pedagogical reasons, to restrict ourselves
to a theory with only a time dimension. Then the
field becomes just like a time-dependent position coordinate
$\phi(t) \sim x(t)$. We may even go one step further, as we
will do here, and consider a zero-dimensional field theory.
'That is not much of a theory', you might complain, and
your point is well taken. Zero-dimensional means there
is no space and no time, so the 'field' has just a single
constant mode (like the zero-energy mode of the theory
above), and is therefore just a real or complex variable. It is
very much a toy model that we only introduce to illustrate
at a very basic level what the effects of quantum corrections
in a field theory look like.


Our toy model only has two real modes: a light mode with
'mass' m denoted by $\phi$, and a heavy mode with 'mass'
$M \gg m$ denoted by $\chi$, and is defined by a simple polynomial
action:

$$ S(\phi, \chi) = \frac{m^2}{2}\phi^2 + \frac{M^2}{2}\chi^2 + \lambda\,\phi^2\chi^2 . $$   (III.4.11)

This action has no derivatives; the terms quadratic in the
fields represent the free modes and the quartic term
describes the interaction between the two modes. This very
rudimentary theory will in the next section be used to illustrate
certain structural (diagrammatic) aspects of perturbation
theory and Feynman rules.


- Quantum Electrodynamics. The third example is the
Lagrangian for QED, the theory we discussed already in Chapter
I.4 and in the section on gauge invariance in Chapter
I.6,

$$ \mathcal{L} = \tfrac{1}{4} F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(\gamma^\mu\partial_\mu + m + e\,\gamma^\mu A_\mu)\,\psi . $$   (III.4.12)

Let us make some observations about this Lagrangian:

(i) It is a compact expression of which each part is manifestly
Lorentz and gauge invariant.

(ii) Besides the Maxwell field describing the photon, and
the Dirac field describing the electrons and positrons, the
action contains two parameters: the electron/positron mass
m and the coupling constant corresponding to the electron
charge e.

(iii) The first three terms are quadratic in the fields and
represent the free part of the action. The first term gives rise
to the free photon propagator, while the second and third
correspond to the free electron/positron propagator. In the
Feynman diagrammatic language these propagators
correspond to the wiggly and straight lines that were shown
in Figure I.4.28 in Chapter I.4, while the final interaction
term corresponds to the interaction vertex diagram of Figure
I.4.29. They are also shown in Figure III.4.19.


The naive scaling dimensions of fields. To be able to
discuss the scaling properties we first determine the naive
scaling dimensions of the fields, which are obtained by
applying dimensional analysis. A good starting point is the
action, which in the system with $\hbar = c = 1$ has dimension
zero: $[S] \sim \ell^0$, and the Lagrangian therefore has units
$[\mathcal{L}] \sim \ell^{-4}$. From the quadratic terms in the Lagrangian of
the scalar field given by equation (III.4.10) we learn that
the dimension of the field has to be $[\phi] \sim \ell^{-1}$. Consequently
the quartic self-coupling $\lambda$ of the $\lambda\phi^4$ term has to
be dimensionless. For the Maxwell field the Lagrangian is
$\mathcal{L} \sim F^2 \sim (\partial A)^2$, and as $[\mathcal{L}] \sim \ell^{-4}$ we conclude that the
gauge potential, like the scalar field, scales like $[A] \sim \ell^{-1}$.
From the mass term for the Dirac field, $\sim m\bar\psi\psi$, we obtain
that $[\psi] \sim \ell^{-3/2}$. And from the interaction term $e\,\bar\psi\gamma^\mu A_\mu\psi$ we
subsequently verify that the coupling constant e is dimensionless.
We summarize the naive scaling dimensions, in
units of length, of the various fields and coupling constants
in Table III.4.2.


  Field      | Dimension || Coupling    | Dimension
  $\phi(x)$  | −1        || m           | −1
             |           || $\lambda_n$ | n − 4
  $A(x)$     | −1        || e           | 0
  $\psi(x)$  | −3/2      ||             |


Table III.4.2: Scaling dimensions in 4-dimensional space-time.
We have listed the naive 'power counting' scaling dimensions
in units of length of some fields and coupling constants. The
self-couplings $\lambda_n$ refer to terms in the energy density of the type
$\lambda_n\phi^n$. Note that the quartic coupling for the scalar field is
dimensionless: $[\lambda_4] = \ell^0$.
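
The power counting can be automated in a few lines. The following Python sketch is an illustration under stated assumptions: dimensions are expressed in units of length (the mass dimension has the opposite sign), the space-time dimension D is a free parameter, and the spinor dimension is read off from the kinetic term $\bar\psi\,\partial\psi$ rather than from the mass term, which gives the same answer. The function names are hypothetical.

```python
from fractions import Fraction


def length_dimensions(D=4):
    """Naive dimensions, as powers of length, when the action has [S] = 0."""
    lagrangian = Fraction(-D)          # [L] = l^-D because [d^D x] = l^D
    derivative = Fraction(-1)          # [d/dx] = l^-1
    phi = (lagrangian - 2 * derivative) / 2     # from (d phi)^2
    gauge = (lagrangian - 2 * derivative) / 2   # from F^2 ~ (dA)^2
    psi = (lagrangian - derivative) / 2         # from psibar d psi
    mass = lagrangian - 2 * psi                 # from m psibar psi
    charge = lagrangian - 2 * psi - gauge       # from e psibar A psi
    return {"phi": phi, "A": gauge, "psi": psi, "m": mass, "e": charge}


def self_coupling_dimension(n, D=4):
    """Dimension of lambda_n in a term lambda_n * phi**n."""
    return Fraction(-D) - n * length_dimensions(D)["phi"]


print(length_dimensions())
print({n: self_coupling_dimension(n) for n in (3, 4, 6)})
```

In four dimensions both the charge e and the quartic coupling come out dimensionless, while couplings with n > 4 carry positive powers of length.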


Scaling in classical field theory. Assigning these scaling
dimensions to the fields allows us to discuss the scale
invariance of classical field theories. To find out we make
a scale transformation of the coordinates $x \to \lambda x$. The
fields, being space-time dependent, will transform accordingly,
like $\phi(x) \to \phi(\lambda x) = \lambda^d\phi(x)$. Note that the parameters
do not transform under this coordinate transformation.
After the transformation of the coordinates and fields we
see that most terms in the action are invariant, and only the
mass terms change in the sense that $m \to m' = \lambda m$. The
net effect is that after the transformation you get the same
theory back but with a different mass parameter. This
argument shows that already at the classical level rescaling
corresponds to the theory moving through parameter
space. A further message is that in classical theories the
mass terms break scale invariance. Massless theories like
the Maxwell theory are therefore scale invariant. In fact
these results also hold for the classical approximation of
the quantum theory, where we think of the field excitations
as particle states but where we ignore the typical quantum
corrections, as will become clear shortly. In the quantum
domain we have to take into account the inverse relation
between length scales (wavelength) and momentum
or energy scales. This implies that if we scale the theory
by large $\lambda$ we effectively take the low energy, long wavelength
limit, which means that the mass is relatively large
and in the limit would become the dominant term. In that
regime we cannot excite particle modes and there is no
dynamics left. If we take the opposite $\lambda \to 0$ limit, then
we study the theory in the high-energy regime where the
mass effectively plays no role! And at this level of the
discussion we would be tempted to conclude that theories
become scale invariant in the high-energy limit. However, this
conclusion turns out to be premature because taking the
quantum corrections into account we will see that these
break this naively expected scale invariance.


Quantum complications. To make sensible predictions
in quantum field theory that can be compared with experiment,
the calculations, which are perturbative in nature,
require a renormalization program to be executed.⁴ It is
exactly this renormalization program, which involves cutting
off certain momentum integrals, that is responsible for
the scaling violations. These violations lead to anomalous
scaling dimensions for the parameters and fields.


We will try to elucidate some of the outstanding features
of that program. One we have mentioned already is that
rescaling the theory is the same as effectively rescaling
the parameters in the model. What one finds is that potentially
at every successive step in the quantum approximation
new interaction terms may appear in what is called
the effective action. In other words, conceivable terms that
had zero coefficient in the classical theory one starts with
may become non-zero. And the behavior of the theory under
rescaling depends on to what extent these extra terms
are relevant at the scale one is interested in. The strong
requirement of renormalizability means that only a finite
number of scale dependent renormalizations of parameters
and fields is needed to render the calculations finite to
any order. This implies that systematic quantum calculations
can be made which lead to unambiguous predictions
for physical observables to arbitrary precision.


The Euclidean path integral


As we pointed out in the subsection on statistical mechanics
in Chapter I.1, there is an interesting analogy between
the statistical description of multi-particle classical physics
and quantum physics, in spite of all their fundamental
differences. This is not too surprising because after all, a field
has an infinite number of modes that represent an infinite
number of local degrees of freedom and we learned that
quantum field theory defines a Hilbert space with states
that can have any number of particles in it.


⁴ I must admit that this sounds like the theory is 'abnormal' and has
to go to a camp to be 'renormalized', through a process of 'ideological
purification', in order to adapt it to the 'new normal'. This terminology of
course started with normalizing wave functions and distributions, just
meaning imposing a norm, saying nothing about wavefunctions being
normal or not.


In classical statistical physics we can derive the thermodynamic
properties from the partition function Z, which is
the sum or integral over the phase space $\Gamma$ of the system,
weighted by the Boltzmann factor,

$$ Z = \int e^{-\beta H(\Gamma)}\, d\Gamma , \qquad \beta = 1/kT , $$

where $H = H(\Gamma)$ is the Hamiltonian of the system, the
integral of the energy density over all of space. An important
quantity is then the (Helmholtz) free energy F defined as
$F = -kT \ln Z$. What we showed in Chapter I.1 was that
the free energy is equal to F = U − TS. And we worked
through the example of the ideal gas in the section on page
53. One thing is obvious: the (classical) statistical physics
underlying thermodynamics amounts to flogging a dead horse
if the temperature is zero, because there is no thermodynamics
as everything is stuck in its lowest state. But that
is different in the quantum domain.


The analogy. Quantum field theory is basically a theory at 
zero temperature, though of course a temperature can be 
introduced in addition. But what makes quantum field the- 
ory at zero temperature already interesting is that there are 
always quantum fluctuations present in the system. This is 
an unavoidable consequence of the uncertainty principle. 
Indeed, the role of thermal fluctuations is taken up by the 
quantum fluctuations, and instead of the temperature the 
external parameters are typically Planck’s constant and 
possibly some coupling constants. In some sense you 
could argue that Planck’s constant takes the place of Boltz- 
mann’s constant and the external parameter that plays the 
role of temperature is a fundamental coupling strength ap- 
pearing in the theory. And indeed whereas the free en- 
ergy governs the classical phase diagram depending on 
the thermodynamic variables like P, V, T and S, that role is 
now played by the masses and coupling constants. There- 
fore one may expect different quantum phases and phase 
transitions to occur in different regions of parameter space 
even at zero temperature. 
  Statistical Physics                                               | Quantum Field Theory
  Phase space $\Gamma$                                              | Field $\phi$
  Energy function $H(\Gamma)$                                       | Euclidean action $S[\phi]$
  Partition function $Z = \int e^{-\beta H(\Gamma)}\, d\Gamma$      | Path integral $Z = \int e^{-S[\phi]/\hbar}\, [d\phi]$
  Free energy $F = -kT \ln Z$                                       | Effective action $S_{eff} = -\hbar \ln Z$


Table III.4.3: Correspondence between the fundamental concepts of (classical) statistical physics and quantum field theory.


The path integral. This fascinating analogy between clas- 
sical statistical physics and quantum field theory becomes 
much more tangible once we introduce the Euclidean path 
integralas a tool to do calculations in quantum field theory. 
In quantum theory we define the (Euclidean) path integral 
or quantum partition function, as a weighted sum over the 
classical configuration space, where each configuration is 
weighted by the exponential of its classical action: 


$$ Z = \int e^{-S[\phi]/\hbar}\, [d\phi]. $$ (III.4.13)


So indeed, the path integral approach to quantum theory 
does away with wave functions and in fact with Hilbert 
space, but shows that the same information on quantum 
amplitudes can be extracted from the corresponding clas- 
sical expressions, and averaged over all paths or classical 
field configurations that match the required boundary conditions.
Of course this integration over infinite-dimensional
spaces is not simple, and in trying to define it properly one
encounters many mathematical pitfalls. It requires defining a
proper integration measure $[d\phi]$ for the space of field
configurations. But even with a suitable measure, calculating
the integral exactly is too much to hope for, and the
best one has been able to do in general is to develop a systematic
approximation scheme by expanding the expressions
in a power series in $\hbar$ and the coupling constants,
using Feynman diagrams and rules. These calculations 
are notoriously subtle and require a rather unusual arsenal
of skills. I will avoid all these highly relevant technicalities
here, but nevertheless continue the overall narrative,
plainly quoting the results along the way if we need them.
In this way I hope to be able to convey the central ideas
and discuss what they mean. I refer interested readers to
the final section of this chapter where we go a step further 
in explaining the perturbative approach and consider some 
specific quantum processes in more detail. 
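
To give a feeling for what 'expanding in a power series in the coupling' means, here is a minimal sketch for a 'zero-dimensional field theory', in which the field is a single number and the path integral reduces to an ordinary integral. This toy setting, the action S(phi) = phi^2/2 + lam*phi^4, the choice hbar = 1 and all function names are assumptions made for illustration only; they are not the calculation referred to in the text. To first order in the coupling one expects Z to be approximately sqrt(2*pi)*(1 - 3*lam), because the Gaussian average of phi^4 equals 3.

```python
import math

def action(phi, lam):
    """Toy Euclidean action S(phi) = phi^2/2 + lam*phi^4 (illustrative choice)."""
    return 0.5 * phi**2 + lam * phi**4

def partition_function(lam, cutoff=10.0, steps=20000):
    """Z = integral of exp(-S) over phi, trapezoidal rule on [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        phi = -cutoff + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * math.exp(-action(phi, lam))
    return total * h

def first_order(lam):
    """Expansion of Z to first order in lam around the free (Gaussian) theory."""
    return math.sqrt(2.0 * math.pi) * (1.0 - 3.0 * lam)

if __name__ == "__main__":
    for lam in (0.001, 0.01, 0.1):
        print(lam, partition_function(lam), first_order(lam))
```

For small values of the coupling the two numbers agree closely, while for larger values the truncated series drifts away from the full integral, which is the toy version of perturbation theory losing its grip.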


In the comparison with statistical physics the temperature 
parameter is replaced by some coupling constant times $\hbar$,
and $S$ is now the classical(!) (Euclidean) action, which is
equal to the Lagrangian density integrated over all of Euclidean
space-time. The integral involves ‘imaginary time’,
which means that we set $t \to i\tau$, so that the flat Minkowski
space-time just becomes 4-dimensional Euclidean space
with $x_4 = \tau$. The idea is that in Euclidean space the mathematical
manipulations are much simpler and in particular
more convergent than in Minkowski space. But the price 
one has to pay is that after the calculation is finished one 
has to ‘rotate’ back to Minkowski space-time in order to 
interpret the results. 


The effective action. The analogue of a free energy is
then the so-called effective action $S_{\rm eff}$:

$$ S_{\rm eff} = \hbar \ln Z. $$ (III.4.14)

And as in the definition of Z we have summed or inte- 
grated over all field variables, the effective action only de- 
pends on the parameters of the theory. This function or its 
derivatives could become discontinuous, signalling what 
we have earlier called quantum phase transitions. A strong 
way to express this analogy is to say that quantum the- 
ory in d spatial dimensions is just statistical mechanics 
in (d + 1) dimensions, where the Euclidean action of the 
d-dimensional space becomes a ‘would be’ (d + 1)-dimensional
Hamiltonian. An example of this was provided by the
d = 2 Ising model (discussed in the section on magnetic 
order in Chapter III.2), where one encounters a quantum
phase transition at zero temperature at some critical value 
of the external magnetic field. It has been shown that the 
characteristics of that transition indeed correspond to the 
d = 3 classical Ising model. 


In Table III.4.3 we have summarized the correspondences 
between statistical physics and quantum field theory. And 
it should be said that this Feynman path integral approach 
to quantum theory is in many ways complementary to the 
operator, Hilbert space approach, and has led to many 
new and valuable insights into the quantum world. It has 
become an indispensable tool in our modeling and understanding
of physical reality.


Scaling and renormalization II 
In this section we discuss scaling properties in a generic 
way, following the renormalization group approach of Ken- 
neth Wilson using the language of the Euclidean path inte- 
gral and the effective action as introduced in the previous 
section. We apply the formalism to the $\phi$-fourth model.
Wilson received the Physics Nobel prize for his work in 
1982. 


The Wilson approach to renormalization. 
The starting point is to define the theory with momentum 
cut-off A: 


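$$ Z = \int [D\phi]_\Lambda \, \exp\Big(-\int \mathcal{L}\, d^d x\Big), $$ (III.4.15)


with the $\phi$-fourth bare Lagrangian density:


$$ \mathcal{L}(\phi, \partial_\mu\phi) = \tfrac{1}{2}(\partial_\mu\phi)^2 + \tfrac{1}{2} m^2\phi^2 + \lambda_4\phi^4. $$ (III.4.16)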


The integration is over all space-time field configurations 
and has a measure with some momentum cut-off: 


$$ [D\phi]_\Lambda = \prod_{|k|<\Lambda} d\phi(k). $$ (III.4.17)


You can think of it in the following way. Any field configuration
can be expanded in a complete set of energy-momentum
eigenfunctions,


$$ \phi(x) = \sum_k a_k\, \phi_k(x). $$


Integrating over the field configurations basically means 
that you integrate over the space of expansion coefficients, 
so the measure is then simply: 


$$ [D\phi]_\Lambda = \prod_{|k|<\Lambda} da_k, $$


where the integral is only performed over the $a_k$ with $|k| < \Lambda$.
The importance of the cut-off is that all integrals are
calculable in principle, but the results may depend on the
cut-off.


Integrating out high momentum modes. We continue by 
splitting the field modes depending on their momentum by 
defining:


$$ \phi = \phi^< + \phi^>, \qquad \phi^>(k) = \begin{cases} \phi(k) & \text{for } b\Lambda < |k| < \Lambda \\ 0 & \text{otherwise.} \end{cases} $$ (III.4.18)

Now we have to expand out the Lagrangian and split it
into the part that depends only on $\phi^<$, which has the same
form as $\mathcal{L}$, and the part that depends on both $\phi^<$ and $\phi^>$
and their derivatives. The path integral then becomes a
product of two factors. The idea is to perform the integral
over the $\phi^>(k)$ components in the second factor.


The effective Lagrangian. The theory obtained after inte- 
gration over these high momentum modes is an effective 
theory for the field but now with a cut-off bA: 


$$ Z = \int [D\phi]_{b\Lambda} \, \exp\Big(-\int \mathcal{L}_{\rm eff}\, d^d x\Big), $$


where the effective Lagrangian $\mathcal{L}_{\rm eff}$ will be equal to $\mathcal{L}$ plus
an infinite number of correction terms in increasing powers
of the coupling constant $\lambda_4$ and the field $\phi$ and its derivatives.
The calculation of this expansion is a complicated
matter and will not concern us here, because the qualitative
features we want to address can be discussed without it.
The philosophy is similar to a calculation we will do in
the toy model in the final section, in that by integrating out
a high-mass variable x we obtain an effective Lagrangian
which can be thought of as an infinite power series in the
remaining low-mass variable. In the toy model this can
be done explicitly, and therefore gives you a good idea. In 
the situation here we deal with fields and their derivatives, 
that all depend on space-time coordinates. The expansion 
becomes similar to the toy model diagrammatically, but the 
loop diagrams now involve integrations over the loop mo- 
menta in the high momentum range. 


Why am I telling you all this, where are we? So far we
have mapped a rather simple field theory with a cut-off $\Lambda$
onto a much more complicated theory with cut-off $b\Lambda$. What
is that good for? To see that we return to the scaling prop- 
erties of the terms in the effective Lagrangian, and apply 
dimensional analysis to the new interaction parameters in- 
troduced by integrating out the high momentum modes. 
Let us write, 


$$ \mathcal{L}_{\rm eff} = \mathcal{L}_0 + \text{correction terms}, $$


where $\mathcal{L}_0$ only contains the quadratic terms describing the
original free field theory. The correction terms in principle
contain all powers of the field $\phi$ and its derivatives. This
is somewhat disturbing as we now have to deal with an 
extremely complicated effective description, but we are not 
done yet. 


The effective theory has a momentum cut-off $b\Lambda$, and it is
this theory we want to rescale. We do so by rescaling the
momentum by $k \to k/b$ and the coordinates by $x \to bx$.
This rescaling of the coordinates brings out certain powers
of b in front of the terms in the effective Lagrangian. And
because b < 1 we are going to smaller spatial and larger
momentum scales. As the Lagrangian has dimension $\ell^{-d}$
and $[\phi] \sim \ell^{-(d/2-1)}$ (that is, $\ell^{-4}$ and $\ell^{-1}$ for d = 4),
one finds that the coupling constant $g_{mn}$ for a term with
m powers of the field and n derivatives has to scale with
a power of the scaling factor b given by:


$$ g_{mn} \sim b^{\,m(d/2-1)+n-d}. $$


This expression is consistent with the values we assigned
before: for example the $\phi^4$ coupling $\lambda_4 = g_{40}$ in space-time
dimension d = 4 indeed yields a power equal to zero,
confirming that $\lambda_4$ is dimensionless. Now we want to distinguish
three possible cases for the scale dependence of
the couplings in the effective Lagrangian:


— power > 0: the term is irrelevant 
— power = 0: the term is marginal 
— power < 0: the term is relevant.

At this point we should mention that also extra terms of the
type that were already present in $\mathcal{L}_0$ will be generated and
these terms are absorbed into the new renormalized fields
and parameters. In the case at hand $\phi \to \phi'$, $m \to m'$
and $\lambda_4 \to \lambda_4'$, so that indeed $\mathcal{L}_0$ in the next iteration looks
the same but with renormalized fields and parameters. If
we imagine iterating this procedure, repeating the rescaling
after integrating out the highest momentum modes, we
get a sequence of maps of the coupling constants, and
it is this sequence of maps that we refer to as the renormalization
group trajectory. What we arrive at is a flow
of the model in the space of coupling constants. The important
point of distinguishing the various terms is that the
irrelevant ones get suppressed by powers of b, while the 
relevant ones have inverse powers and will grow. For the 
marginal operators one has to make higher order calcula- 
tions to determine which way they will go. The upshot is 
that the renormalization group action maps out a trajectory 
of the given theory in the space of coupling constants, in 
other words, in the space of theories. In the next section 
we will discuss the renormalization group equations that 
determine the trajectories and go through some relevant 
examples. 


Note that the question whether interaction terms are relevant,
irrelevant or marginal depends strongly on the space-time
dimension d. One can easily check that the $\phi^4$ term
is marginal for d = 4 but it becomes relevant if d < 4. For
d = 2 one finds that all powers of the field become relevant,
because the exponent becomes −2 for all of them.
A mass term scales as expected like $b^{-2}$ and is therefore
relevant for all d > 2.
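
The bookkeeping of relevant, marginal and irrelevant terms is easy to automate. The sketch below assumes that the power of b multiplying a coupling with m powers of the field and n derivatives is m(d/2 − 1) + n − d, which is the form consistent with the checks quoted in the text; the function names are mine and purely illustrative.

```python
def scaling_power(m, n, d):
    """Power of b for a term with m powers of the field and n derivatives
    in d space-time dimensions (assumed form: m*(d/2 - 1) + n - d)."""
    return m * (d / 2 - 1) + n - d

def classify(m, n, d):
    """Label a term following the three cases listed in the text."""
    power = scaling_power(m, n, d)
    if power > 0:
        return "irrelevant"
    if power < 0:
        return "relevant"
    return "marginal"

if __name__ == "__main__":
    print("phi^4, d=4:", classify(4, 0, 4))
    print("phi^4, d=3:", classify(4, 0, 3))
    print("phi^6, d=4:", classify(6, 0, 4))
    print("mass term, d=4:", scaling_power(2, 0, 4))
    print("phi^6, d=2:", scaling_power(6, 0, 2))
```

Running it reproduces the statements made above: the quartic term is marginal in four dimensions and relevant in three, a sixth power of the field is irrelevant in four dimensions, and in two dimensions every pure power of the field gets the exponent −2.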


The asymptotic behavior of the theory one considers now 
depends on where these trajectories go. They may move 
towards a fixed point that could be either zero or nonzero, 
or trajectories could run off to infinity, which means that
the theory loses its meaning and becomes inadequate
to describe the physics. The irrelevant terms go to zero as 
they are suppressed by the increasing powers of the cutoff. 


So most of the scary looking terms that appeared after in- 
tegrating over high momentum modes disappear again be- 
cause of the rescalings, and because of their irrelevance. 
This brings us back to the question of scale invariance. 
If the couplings in a theory go to a fixed point, then the 
theory defined by that fixed point is by definition scale in- 
variant! 


We note that the $\phi^4$ theory in four space-time dimensions
has what is called a ‘trivial’ fixed point where the parameters
$m^2$ and $\lambda_4$ are both zero, and $\mathcal{L}_0 = \tfrac{1}{2}(\partial_\mu\phi)^2$. This theory
is in fact invariant under the conformal group as we have
mentioned before. It has been shown that the $\phi^4$ theory for
d = 3 has a non-trivial fixed point, the so-called Fisher-Wilson
fixed point.


The statement is that theories that have only relevant and 
marginal terms are called renormalizable. It is in those 
theories that it is possible to take the cut-off A to infinity 
sending the irrelevant terms to zero. The effect of all the 
quantum perturbations can then be absorbed in sensible 
redefinitions of field and parameters. 


The importance of Wilson’s renormalization group perspective
is that it a priori assumes that there is a real physical
cut-off and that the physics at lower energy may show
some dependence on it. This typically is the case in ap- 
plications in condensed matter and you had better take it 
into account. There is no need to send the cut-off to infinity, 
because it is really there. On the other hand it used to be 
somewhat of a mystery if not a miracle why the fundamen- 
tal theories like the Standard Model are all renormalizable 
(from the start). And one wondered why Nature was so 
judicious in its choice. Just to please physicists so they 
could do meaningful perturbative calculations? The Wilson
approach makes clear that renormalizability is exactly
what survives in a natural way: the renormalizable terms are
the ones that survive in the renormalization group flow. Quite
arbitrary theories may well flow towards a scale invariant
fixed point that lies inside a subspace of relevant renormalizable
theories, which do not need to be scale invariant!
The Wilsonian perspective we have outlined leads to the
conclusion that the renormalizable models are universal in
that they describe the asymptotic behavior of large classes
of other models.


The quantum bank. Whether you study the stars, write poems,
or are a world champion arm wrestler, in the
end we all have to deal with banks. You
need a loan or a mortgage and you get immersed 
in a labyrinth of options: this one looks even more 
advantageous than the other. Ultimately it always
boils down to interest rates, and those rates are cal- 
culated based on a mysterious mixture of facts and 
fictions concerning the certainties of your present 
and the uncertainties of your future. But one thing 
remains true under all circumstances: borrowing 
money costs money! And you are happy because 
you are spending money you don’t have! 
Now back to quantum. In the realm of quantum
theory the currency is energy rather than dollars. 
Yet there is also a bank, which is basically the vac- 
uum itself. We know that because of the Heisen- 
berg uncertainty relations, a quantum marble can- 
not be at rest at the bottom of the bowl, it has to 
jiggle around a bit. There is no certainty ever in the 
quantum world. This may work to the advantage 
of the participants in the sense that there are al- 
ways quantum fluctuations even in the ground state 
and even at zero temperature. Quantum reality is 
such that there is always some energy around. And 
the idea of the cooperative quantum bank is that it 
provides very cheap energy loans, but they come 
with some unusual restrictions. The slogan is, you 
can borrow as much as you want but only for a very 
short period. 
Whereas the money banks usually have very high 


interest rates for ultra short-term loans, the quan- 
tum bank’s energy loans work exactly the opposite 
way. As long as $\Delta E \times \Delta t < \hbar$ you are doing
fine. So if I am a photon and play it big, I can borrow
energy so that I can produce for example an
electron-positron pair to impress my fellow photons 
as long as the loan is very short term. But now the 
catch is that because the overall energy has to be 
conserved, the quantum bank insists that you re- 
turn your energy before the Federal Reserve gets 
wind of it. And this is what certain real-life Quants 
in real banks don’t seem to understand. There is 
a moment of reckoning: you speculate yourself into 
heaven, but you have to be back home with two legs 
on the ground in time! In other words the quantum 
world makes sure that the pair just created annihi- 
lates back into the vacuum and the photon contin- 
ues its journey, as if nothing ever happened to it. 
You would think. But no, it isn’t as simple as that.
The photon carries its creative banking experiences 
with it and they effectively change its behavior. 

It reminds me of my good old student days at Delft
University, when I was cycling home late at night
along the beautiful ‘Oude Delft’ canal from the lab,
or was it a party? Suddenly I got pulled over by the
police. Trouble! Probably a costly ticket because I
had no lights on my bike. And while the officer was
searching for his ticket book in the car, I shoved my
old bike in the canal. Bloop ... gone! When the police
officer returned and started to make a solemn
declaration about ‘your bike, sir, appears to be missing
some appropriate lighting’ ... I interrupted him
and asked what bike he was talking about. ‘But I
thought that ...’ ‘Yes, maybe you thought, but look
...’ This caused some consternation. Indeed, here
were powerful fluctuations at work that the officer 
on duty apparently had no working knowledge of!


Running coupling constants


As we have seen, quantum theory and in particular quan- 
tum field theory has come up with a surprising answer to 
questions about the spatial or momentum scale depen- 
dence of the coupling parameters in a given theory. Though 
the road to the result is highly technical and the arguments 
may at first appear to be quite opaque, what results is 
clear-cut and strikingly simple.


The renormalization program yields equations that govern 
the behavior of the parameters of the theory as a function 
of scale. These are differential equations that remind you 
of an ordinary dynamical system, say a set of interacting 
Newtonian particles. Now you have to imagine that these 
equations describe the ‘motion’ of a given theory in pa- 
rameter space, not as a function of time but as a function 
of scale! And that’s where the term ‘running coupling con- 
stants’ comes from. It is kind of mind boggling to think of a 
given theory ‘running’ in the space of theories. Yet that is 
what happens and moreover, it teaches us about the limit- 
ing or asymptotic behavior of such theories. This may be 
in the high momentum (ultraviolet) or the low momentum 
(infrared) limit, depending on the problem one is interested 
in. 


The first and maybe simplest equation of this type, called
the Gell-Mann-Low equation, was written down for QED.
Later the general renormalization group approach which
we described culminated in the so-called Callan-Symanzik
equations for the scaling behavior of any composite local
operators of the type we encountered in the expansions.
These renormalization group equations govern the flow of
points in the space of (renormalized) coupling constants,
which we will denote by W. Let us consider the simple
case of a single coupling constant g. The theory has a momentum
cut-off $\Lambda$, and the equation involves the renormalized
coupling, again denoted by g, which depends on the
momentum scale through its logarithm only, $g = g(\log p)$,
where we choose p to be the dimensionless momentum
variable, that is, the momentum measured in units of the
cut-off $\Lambda$. The renormalization group equation
has the simple form:


$$ \frac{dg}{d\log p} = \beta(g). $$ (III.4.19)


This equation just says that the rate of change of the coupling
g equals a function $\beta(g)$, not surprisingly called the
beta-function. This function depends on $\log p$, but only
through the coupling constant g. In that sense you can
think of it as a functional equation for g as a function of $\log p$,
in the spirit of equation (III.4.5).


Let us assume that at some large distance (small momentum)
this coupling is small; then we may look at $\beta$ for small
g and develop it there as a power series like:


$$ \beta(g) = a g + b g^2 + \cdots, $$ (III.4.20)


and for small g the successive terms will become ever 
smaller and we can safely truncate the series. Now given 
the quantum field theory, the coefficients $a, b, \ldots$ can be
calculated using perturbation theory. This approach allows 
us to deduce important general features of the theory. It is 
important though to note that because the beta function 
is mostly calculated perturbatively, it follows that the re- 
sults obtained can only be trusted in the domain where the 
perturbation theory holds, in other words, where the ex- 
pansion parameters are small. Of course the rare cases 
where models can be solved exactly serve as ideal testing 
grounds for the tools we are describing here. 


Mechanical analogues 


Let me point out a mechanical analogy that should be familiar
and thus helpful. It refers back to our discussion on
dynamical systems on page 11 of the section on Newtonian
mechanics in Chapter I.1. If we think of a complicated
theory with many parameters, we will have a system
of coupled equations, but all with the same first derivative
with respect to log p on the left-hand side. Using log p
instead of p makes the equations particularly simple; the
only thing you have to keep in mind is that log p grows
monotonically with p, so if log p becomes large then p
does also, but for p going to zero log p goes to minus infinity.
In this sense we may think of log p as some kind of
‘time’ variable t. Then the equation just describes the motion
of a point in the (coupling constant) space W of the
system.


Stated differently, as the point represents a particular the- 
ory, its motion describes a trajectory in a space of theories! 
The left-hand side is the ‘velocity’ or rate of change which 
depends — through the expression on the right-hand side 
— on where you are in the coupling constant space. So the 
equation defines a vector or flow field over W, in a sim- 
ilar way that Newton’s dynamical equations for a particle 
define a flow over the phase space, as we discussed in 
Chapter I.1. The equation governs the trajectories completely
once the initial conditions for g(t) at some $t = t_0$
are given, just like Newton’s equations do after you give
the initial positions and velocities of a bunch of interacting
point particles. These dynamical systems are usually
nonlinear, and also in the case at hand the dynamical system
is nonlinear, as we see from the expansion in equation
(III.4.20). As we remarked in Chapter I.1, it allows us to
search for universal behavior, because the system may for
large values of time, or high momentum ($t = \ln p$), end
up in some fixed point or limit cycle and may for long time
scales exhibit universal behavior.


It is amusing to see how we manage to address deep 
questions in the realm of Quantum Field Theory because 
we have been able to map the problem onto a rather simple
Newtonian dynamical system. Indeed, from equation
(III.4.19) one sees that for the points where the $\beta$-function
vanishes the ‘velocity’ is zero, so these points correspond
to stationary points. This translates into the statement that
the theory becomes invariant under further rescaling. It
is a theory which is called ultraviolet (high momentum)
stable, because it has ended up in some scale invariant
fixed point. Note that this analysis allows for theories
that are quite different initially to end up in the same ultraviolet
fixed point. They belong to the same universality
class.


Figure III.4.15: Fixed points. The three types of fixed points
one may encounter in a two-dimensional parameter space:
(a) a stable fixed point, (b) landscape near a stable fixed point,
(c) an unstable fixed point, (d) landscape near an unstable
fixed point, (e) a saddle point, (f) landscape near a saddle
point. In (f) we see that lines of steepest descent and steepest
ascent are perpendicular.
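
For readers who like to see the dynamical-systems picture in action, here is a small sketch of how one would classify a fixed point in a two-dimensional coupling space. It assumes the standard linearization of a flow near a stationary point: small deviations obey d(delta g)/dt = J delta g, with J the 2x2 matrix of derivatives of the beta functions at the fixed point, and the signs of its determinant and trace decide the type. The function name and the example matrices are hypothetical; 'attractive' here means attractive for increasing t = log p.

```python
def classify_fixed_point(j11, j12, j21, j22):
    """Classify a fixed point of a two-coupling flow from the linearized
    flow matrix J = [[j11, j12], [j21, j22]] evaluated at that point."""
    trace = j11 + j22
    det = j11 * j22 - j12 * j21
    if det < 0:
        # eigenvalues of opposite sign: one direction in, one direction out
        return "saddle"
    if det > 0 and trace < 0:
        # both eigenvalues have negative real part
        return "attractive"
    if det > 0 and trace > 0:
        # both eigenvalues have positive real part
        return "repulsive"
    # zero determinant or zero trace: the linear analysis is inconclusive
    return "undetermined at linear order"

if __name__ == "__main__":
    print(classify_fixed_point(-1.0, 0.0, 0.0, -2.0))
    print(classify_fixed_point(1.0, 0.0, 0.0, 2.0))
    print(classify_fixed_point(1.0, 0.0, 0.0, -2.0))
```

The last branch is the two-dimensional counterpart of a marginal coupling: when the linear term does not decide the matter, one has to go to higher order.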


For a single coupling we have only one dimension and it
is straightforward to see what is possible for small coupling,
where we only take along a few terms in the expansion
(III.4.20). For example the system may move to a stable
fixed point where it would stay forever after, or we may
have an unstable fixed point and the system would move
away from it under any small perturbation.


If we think of a two-dimensional parameter space, such
stationary points can in general only have three generic
types of behavior: either the point is attractive, or repulsive,
or it is a saddle point. This is illustrated in Figure III.4.15. In
two or more dimensions one also could imagine the possibilities
of limit cycles, or more exotic attractors where the
system could display even chaotic behavior. But generic
features of these renormalization group equations appear
to exclude that. Nature saves us the humbling demise that
our theories would get lost in chaotic asymptotics. Coping
with quantum uncertainties is enough of a challenge!


Figure III.4.16: Beta functions of the $\phi^4$ theory in 3 and 4
dimensions. Depending on the starting value $\lambda_i$ we have
sketched the asymptotic behavior of the coupling constant
$\lambda = \lambda(\log p)$. In the infrared region (decreasing log p) there is a
non-trivial IR fixed point at $\lambda = \lambda_0$ for d < 4. In the ultraviolet limit
(increasing log p) we find for $\lambda_i < \lambda_0$ a trivial fixed point ($\lambda = 0$)
where the theory behaves like a free theory with zero coupling.
For $\lambda_i > \lambda_0$ the coupling keeps increasing, at least for as long
as the expansion makes sense.


The scalar $\phi^4$ theory. Let us now turn to an explicit example.
What do the renormalization group equations look
like for the scalar model we have been discussing? It has
two parameters, $m^2$ and $\lambda_4$, and therefore two equations
with two beta functions. To lowest non-trivial order these
read as follows:


$$ \frac{d\lambda_4}{d\log p} = \beta_\lambda(\lambda_4) = -(4-d)\,\lambda_4 + \frac{3\lambda_4^2}{16\pi^2}, $$ (III.4.21)

$$ \frac{dm^2}{d\log p} = \beta_m(\lambda_4) = [-2 + \gamma_m(\lambda_4)]\, m^2. $$ (III.4.22)


Let us make some observations with respect to these equa- 
tions: 

(i) The constant term on the right-hand side of the equa- 
tions gives the naive scaling dimensions, namely 0 and —2 
respectively. 

(ii) The other terms are radiative corrections to these numbers,
and are supposed to be small. The anomalous term
$\gamma_m(\lambda_4)$ vanishes if $\lambda_4 = 0$.

(iii) As all corrections take the form of a power series in
$\lambda_4$ only, it is the $\lambda_4$ equation that drives the dynamics. So,
let us then start with the first equation. In four or more dimensions
the beta function is positive and the coupling will
therefore keep growing until it becomes so large that the
perturbation series breaks down, in that successive terms
are no longer decreasing. What happens in the strong
coupling regime in that case cannot be answered through
this analysis: because the series diverges, the approximation
scheme becomes invalid. One would have to resort
to strong coupling approximations, meaning numerical lattice
simulations. The conclusion appears to be that there
is no fixed point at larger values of the coupling, which
means that the theory deteriorates into the quartic term,
not a physically interesting or meaningful result.

(iv) In Figure III.4.16 we depicted the beta function $\beta_\lambda(\lambda_4)$
for d = 3 and d = 4. The blue direction is the direction
of increasing momentum (ultraviolet), while the red
arrows point in the direction of decreasing momentum (infrared).
We see that for d < 4 the beta function has two zeros, meaning
that there are two fixed points: one at zero and one
larger than zero at some value $\lambda^*$. The new point is an infrared
stable fixed point.

(v) If we let d approach 4 from below we see that the two
fixed points merge at the trivial fixed point $\lambda_4 = 0$, which
corresponds to the free field theory.

(vi) Finally the renormalization equation for the mass parameter
has on the right-hand side the constant −2 which
just reflects the naive scaling we have discussed already.
If we scale up the theory one expects the mass term to become
less and less relevant, while in the trivial fixed point
it is the one and only relevant parameter. The precise form
of the solution is:


$$ m^2 \propto p^{\,-2+\gamma_m(\lambda_4)}. $$ (III.4.23)


If in less than four dimensions the system sits in the non-trivial
Fisher-Wilson infrared fixed point, we get the anomalous
correction to the naive scaling law corresponding to
$\gamma_m(\lambda^*)$. This correction plays a role in the d = 3 statistical
physics of magnetic materials. It has no effect on the
ultraviolet behavior of the theory. 
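
The two fixed points for d < 4 can be made explicit with a small numerical sketch. It assumes the lowest-order beta function in the form −(4 − d)λ + 3λ²/(16π²); the coefficient 3/(16π²) depends on how the coupling is normalized and should be read as an assumption here, and the function names are mine. The code integrates the flow in the ‘time’ variable t = log p with a standard fourth-order Runge-Kutta step, towards the infrared (t negative) and towards the ultraviolet (t positive).

```python
import math

def beta_lambda(lam, d):
    """Assumed lowest-order beta function: -(4-d)*lam + 3*lam^2/(16*pi^2)."""
    return -(4 - d) * lam + 3.0 * lam**2 / (16.0 * math.pi**2)

def nontrivial_fixed_point(d):
    """Non-zero solution of beta_lambda(lam, d) = 0."""
    return (4 - d) * 16.0 * math.pi**2 / 3.0

def flow(lam0, d, t_end, steps=4000):
    """Integrate d(lam)/dt = beta(lam) from t = 0 to t_end with RK4.
    A negative t_end runs the flow towards the infrared."""
    h = t_end / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta_lambda(lam, d)
        k2 = beta_lambda(lam + 0.5 * h * k1, d)
        k3 = beta_lambda(lam + 0.5 * h * k2, d)
        k4 = beta_lambda(lam + h * k3, d)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam

if __name__ == "__main__":
    d = 3
    print("fixed point:", nontrivial_fixed_point(d))
    print("infrared, start below:", flow(10.0, d, -20.0))
    print("infrared, start above:", flow(80.0, d, -20.0))
    print("ultraviolet, start below:", flow(10.0, d, 20.0))
```

In three dimensions both infrared runs end up at the same non-zero value, while the ultraviolet run from a small starting value drifts to zero coupling. The numerical value of the fixed point obtained from this truncation is not small, so it should not be taken seriously in itself; only the qualitative picture is meant.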


Gauge couplings 


The following picture emerges: the constant a in the ex- 
pansion of the beta function (III.4.20) has a generic struc- 
ture: 


$$ a = d - n, $$ (III.4.24)


where d is the physical space-time dimension and n is 
some critical dimension, critical because the scaling be- 
havior of the theory depends critically on whether d is small- 
er or larger than n. If d < n then a < 0 and the growth 
rate of g is negative and g will decrease with growing mo- 
mentum, or what amounts to the same, with decreasing 
distances. In this case the coupling constant will go to 
zero; it is as if no interactions are left at small distances. And
as the linear approximation will get better and better with 


decreasing g, the prediction that this theory behaves as a 
theory of free particles at small distances is reliable and 
consistent. 


If however d > n the situation looks pretty bad because 
now the coupling grows bigger at smaller distances and 
the approximation breaks down and we would need the 
complete $\beta$ function.


Now there is still the ‘in between’ possibility with d = n, 
and it turns out to be of considerable interest in the situ- 
ations that nature faces us with. In that case we have to 
turn to the next term in the series with coefficient b. If we
only keep the b term the solution becomes:


$$ g(p) = \frac{c}{1 - b\,c\,\log p}, $$ (III.4.25)


with c some positive constant. Again we may look at what
happens when b is positive or negative, respectively. The
different behaviors are plotted in Figure III.4.17 and interestingly
we encounter both cases in realistic particle theories.
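
A few lines of code make the two behaviors tangible. The sketch assumes the truncated equation dg/dt = b g² with t = log p, whose solution with g = c at t = 0 is g = c/(1 − b c t); the function names and the sample numbers are illustrative choices of mine, not values taken from any real theory.

```python
import math

def running_coupling(c, b, t):
    """g(t) = c / (1 - b*c*t) with t = log p; only meaningful while
    the denominator stays positive."""
    denom = 1.0 - b * c * t
    if denom <= 0.0:
        raise ValueError("beyond the singularity of the truncated solution")
    return c / denom

def singular_scale(c, b):
    """Value of t = log p where the truncated solution blows up.
    For b <= 0 (and c > 0) there is no singularity at positive t."""
    if b <= 0:
        return math.inf
    return 1.0 / (b * c)

if __name__ == "__main__":
    c = 0.1
    for t in (0.0, 2.0, 5.0, 9.0):
        print(t, running_coupling(c, 1.0, t), running_coupling(c, -1.0, t))
    print("singularity for b = +1 at t =", singular_scale(c, 1.0))
```

For positive b the coupling grows with t and diverges at t = 1/(b c), while for negative b it keeps shrinking as t increases. One can also check numerically that the slope of g(t) equals b g², as the truncated equation demands.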


Quantum electrodynamics. The case with positive b cor- 
responds to pure quantum electrodynamics (QED), the the- 
ory of photons, electrons and positrons. From the blue 
curve we see that g becomes very small for small values
of log p, that is, large distances. We recall that the expansion
parameter in QED is the fine structure constant
$\alpha = e^2/4\pi\epsilon_0\hbar c \approx 1/137$. This corresponds to the familiar
regime where charges are free and have weak electric
interactions, and perturbation theory can be trusted,
and allows for calculations of exceptional precision. The
decrease of the coupling at larger distances reflects the
situation that the quantum fluctuations tend to screen the
‘bare’ or ‘naked’ charge. This effect is called vacuum polarization
because due to the quantum uncertainties the
quantum fluctuations in the fields result in the excitation of 
virtual electron-positron pairs and these screen the ‘bare’ 
charge. Vacuum polarization is discussed in more detail in 
the following section on page 579.


Figure III.4.17: Running coupling constants. On the left a plot
of the beta function $\beta(g)$ of equation (III.4.20) for positive (blue)
and negative (purple) sign of the constant b. On the right we
plotted the solutions for $g(\ln p)$ of equation (III.4.25), showing
the dependence of the coupling strength on the logarithm of the
momentum. The blue curve goes to infinity for a finite value
of log p at the so-called Landau singularity. The purple curve
corresponds to negative b: the coupling tends to zero for small
distances, which is called asymptotic freedom.


For increasing momenta the coupling becomes infinite for 
some finite momentum scale, which taken literally would 
suggest that the naked charge would be infinite. This sin- 
gular behavior is called a Landau pole, and the presence 
of such a pole indicates that the theory becomes untenable 
past a certain scale and has to break down somehow. In 
general it is true that a coupling growing large is a strong 
signal that the theory is no longer to be trusted past that 
point. This is not a disaster but just a whistleblower an- 
nouncing that the model is losing its validity and presum- 
ably some new physics has to enter the conversation to 
allow us to escape the singularity. 
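To get a feeling for where such a Landau pole sits, here is a minimal numerical sketch. It assumes the standard one-loop QED formula with only a single electron loop, $\alpha(Q) = \alpha_0 / [1 - (\alpha_0/3\pi)\ln(Q^2/m_e^2)]$; the function names and the restriction to one fermion species are illustrative assumptions, not the book's equation (III.4.25), and with all charged particles included the numbers shift.

```python
import math

ALPHA_0 = 1 / 137.036      # fine structure constant at low energies
M_E_GEV = 0.000511         # electron mass in GeV

def alpha_qed(q_gev, alpha0=ALPHA_0, m=M_E_GEV):
    """One-loop running coupling, single electron loop only (assumption):
    alpha(Q) = alpha0 / (1 - (alpha0 / 3 pi) * ln(Q^2 / m^2))."""
    log_term = math.log(q_gev**2 / m**2)
    denom = 1.0 - alpha0 / (3 * math.pi) * log_term
    if denom <= 0:
        raise ValueError("beyond the Landau pole: perturbation theory has failed")
    return alpha0 / denom

def log10_landau_pole_gev(alpha0=ALPHA_0, m=M_E_GEV):
    """log10 of the scale (in GeV) at which the denominator above vanishes."""
    ln_q_over_m = 3 * math.pi / (2 * alpha0)
    return math.log10(m) + ln_q_over_m / math.log(10)

print(1 / alpha_qed(91.19))      # inverse coupling at the Z mass: slightly below 137
print(log10_landau_pole_gev())   # power of ten of the Landau scale in GeV
```

In this simplified setting the coupling grows only very slowly, and the pole lies at an absurdly high scale, far beyond any energy where QED is used on its own.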


This illustrates again a notion that I have mentioned before,
namely that theories are not right or wrong per se,
but rather have a limited domain of validity. In the present
context a large coupling usually means that the physical
system will enter another regime for which the theoretical
picture one started off with becomes inadequate. Renormalization
in that sense helps theories to predict their own
demise. How nice to have theories which know about their
own limitations. For the case at hand the resolution came
much later when it was discovered that at small scales it
made no sense to look at the electromagnetic interactions 
separately. The remedy was to combine the electromag- 
netic and the weak nuclear interactions into a single unified 
‘electroweak’ theory which turned out to behave extremely 
well also for extremely small distances as we have been 
able to verify in the Large Hadron Collider (LHC) at the 
European accelerator centre CERN in Geneva. In fact the 
word ‘large’ here implies precisely ‘large momenta’, and 
this collider smashes particles into each other with very 
high energies, and that means that they can come very, 
very close to each other. The LHC was specially built to 
investigate what happens to the interactions at very small 
distances. 


Quantum chromodynamics. Let us now look at the purple
curve in the figure, corresponding to negative values of
b. It shows that the coupling goes to zero with increasing
log p, that is, at smaller distances. Therefore the theory ends
up describing non-interacting (free) particles at large
momenta. This behavior under scaling is realized in Quantum
Chromodynamics (QCD), the theory of the strong nuclear
force. We see that the ‘strong’ interactions between
quarks, paradoxically enough, become extremely weak at
small distances. This remarkable behavior of the strong
interactions is called asymptotic freedom. For a long time
it was thought that the problem of the strong nuclear forces
could never be solved along the lines of quantum field
theory, but this picture changed drastically after ‘asymptotic
freedom’ was discovered and the strong interactions
were tamed, because they turned out to be the manifestation
of a well-behaved, weakly coupled theory at small
distance scales. This totally different asymptotic behavior
of QED and QCD is of course due to the self-interacting
nature of the gluons. Those self-interactions distinguish
the non-abelian from the abelian theories. For the discovery
of asymptotic freedom the 2004 Nobel Prize in Physics
was awarded to David Gross, David Politzer and Frank
Wilczek.
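Asymptotic freedom can be made tangible with a few lines of code. The sketch below uses the standard one-loop formula for the strong coupling, $\alpha_s(Q) = \alpha_s(M_Z) / [1 + \alpha_s(M_Z)\, b_0 \ln(Q^2/M_Z^2)]$ with $b_0 = (33 - 2n_f)/12\pi$. The input value $\alpha_s(M_Z) \approx 0.118$, the fixed number of quark flavors $n_f = 5$ and the function name are assumptions made for illustration; a careful calculation would include higher orders and flavor thresholds.

```python
import math

ALPHA_S_MZ = 0.118   # assumed strong coupling at the Z mass
M_Z_GEV = 91.19      # Z boson mass in GeV

def alpha_s(q_gev, n_flavors=5):
    """One-loop running strong coupling with a fixed number of flavors."""
    b0 = (33 - 2 * n_flavors) / (12 * math.pi)
    return ALPHA_S_MZ / (1 + ALPHA_S_MZ * b0 * math.log(q_gev**2 / M_Z_GEV**2))

# The coupling shrinks as the momentum scale grows (smaller distances).
for q in (10.0, 91.19, 1000.0):
    print(q, round(alpha_s(q), 4))
```

The positive sign of $b_0$ for $n_f < 17$ traces back to the gluon self-interactions mentioned above; with that contribution removed the sign would flip and the coupling would grow with momentum as in QED.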




From the purple curve we also see that going towards
small momenta the coupling grows ‘without limit’ at some
finite value of log p. This behavior is sometimes called
infrared slavery because the particles would become extremely
strongly coupled. The physical interpretation of an
increasing coupling constant is that at a scale where the
coupling becomes of order unity, the perturbative predictions
lose their reliability, and one expects other physics
and non-perturbative effects to come into play. For QCD
there are two fundamental phenomena that are linked to
this. The first is the formation of the quark-antiquark
condensate that causes chiral symmetry breaking, as we
discussed in Chapter II.6 on symmetry breaking on page 441.
This symmetry breaking led to the interpretation of the
three pion particles ($\pi^\pm$ and $\pi^0$) as the ‘massless’
Goldstone degrees of freedom associated with the breaking.
The second non-perturbative phenomenon manifest at that
scale is the confinement of quarks. As mentioned before,
the collective of quarks reorganizes itself into tightly bound
composites called hadrons, made up of either three quarks
(called baryons) or a quark-antiquark pair (called
mesons). The protons and neutrons are the nuclear particles
from which all familiar forms of matter are built, and
these are baryons. The pions however belong to the group
of the mesons. What this means is that at scales where
the coupling becomes large the perturbative approach breaks down
and the behavior of the theory is no longer what one would
expect from its weak coupling behavior. At that point it
is important to switch to a different effective theory that is
formulated directly in terms of the hadrons. In the case of
QCD this turns out to be a nonlinear sigma model that we
will not further dwell on.


Grand unification: where strong joins weak 


The idea of renormalization and running coupling constants 
led to a powerful insight into the possibility of unifying the 
different types of interactions into a single framework often 




Figure III.4.18: Unifications. The subsequent unifications of 
fundamental interactions suggested by the running coupling 
strength of the various forces meeting at ever higher energy 
scales. Experiments at the Large Hadron Collider at CERN in 
Geneva go up to about 10° GeV. 


referred to as a Grand Unified Theory or GUT. We have al- 
ready alluded to the fact that the problem of the ill-defined 
electromagnetic coupling at small distances was resolved 
by the unification of the electromagnetic and the weak
interactions. On the other hand we mentioned the strong
nuclear force, which turned out to become weak at short
distances. As we explained in Chapter I.4, these three
interactions are now described in a single combined theory
called the Standard Model. This theory has so far suc- 
cessfully survived extensive testing through many different 
types of experiments, and appears to be able to predict 
and explain all the data that are available at present. 


To give you an impression of what could be a successful
next step up the quantum ladder of unification you should
look at Figure III.4.18. The picture gives the expectation of
how the grand unification could be achieved. Experiments
go up to a level of 500 GeV, so we have witnessed the
electroweak unification and we see the strong coupling coming
down. Applying the renormalization techniques and
scaling arguments we discussed before to the Standard
Model, one may calculate the trajectories of the various
coupling constants (assuming that no new physics shows
up at other intermediate scales) to substantially higher
energy scales, and indeed it is suggestive to anticipate a further
unification at the GUT scale of around $10^{16}$ GeV. In
fact if we extend the Standard Model to its minimal
supersymmetric extension, the resulting trajectories for the
three couplings of the model really intersect at a single
point near $10^{16}$ GeV, as is shown on the right-hand side of
Figure III.4.18. Then the extra (susy) scale of the breaking
of supersymmetry has to be introduced because we
haven't observed any superpartners of the ordinary particles
at low energies. Even more speculative would be the
unification with gravity at the Planck scale of $10^{19}$ GeV. Such
are the grand vistas and holy grails of modern high-energy
physics.
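The extrapolation of the three couplings can be imitated in a rough way. The sketch below assumes the standard one-loop formula $\alpha_i^{-1}(\mu) = \alpha_i^{-1}(M_Z) - (b_i/2\pi)\ln(\mu/M_Z)$, approximate textbook input values at the Z mass (with the usual GUT normalization of the first coupling), and the well-known one-loop coefficients for the Standard Model and its minimal supersymmetric extension. All names and input numbers are illustrative assumptions; the supersymmetric running is naively started at $M_Z$, ignoring the extra susy scale mentioned in the text.

```python
import math

M_Z = 91.19                          # GeV
INV_ALPHA_MZ = (59.0, 29.6, 8.5)     # approximate 1/alpha_i at M_Z (assumed inputs)
B_SM = (41 / 10, -19 / 6, -7.0)      # one-loop coefficients, Standard Model
B_MSSM = (33 / 5, 1.0, -3.0)         # one-loop coefficients, minimal susy extension

def inv_alpha(t, b):
    """Inverse couplings at t = ln(mu / M_Z): 1/alpha_i(M_Z) - b_i * t / (2 pi)."""
    return [a0 - bi * t / (2 * math.pi) for a0, bi in zip(INV_ALPHA_MZ, b)]

def meeting_of_first_two(b):
    """Scale (GeV) where couplings 1 and 2 meet, and the spread of all three there.

    A spread close to zero means the third coupling passes through the same point.
    """
    t = 2 * math.pi * (INV_ALPHA_MZ[0] - INV_ALPHA_MZ[1]) / (b[0] - b[1])
    values = inv_alpha(t, b)
    return M_Z * math.exp(t), max(values) - min(values)

for name, b in (("SM", B_SM), ("MSSM", B_MSSM)):
    mu, spread = meeting_of_first_two(b)
    print(name, f"mu = {mu:.2e} GeV", f"spread = {spread:.2f}")
```

With these inputs the three lines miss each other in the Standard Model, while in the supersymmetric case they nearly meet in a single point, in line with the right-hand side of Figure III.4.18.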


Phase transitions 


In the previous sections we have seen that in many-body
systems described by statistical mechanics or quantum
field theory we may, by changing external parameters
such as the temperature or some coupling constants, have
the theory end up in a fixed point of the renormalization
group equations. At points where the beta function vanishes
the theory is scale invariant. We have seen an ultraviolet
fixed point in QCD and an infrared fixed point in the
$\varphi^4$ (phi-fourth) theory in three dimensions.


In most of physics this remarkable property of scale in- 
variance is the hallmark of a so-called critical point where 
the system exhibits critical behavior. The behavior around
such fixed points may exhibit fluctuations on all scales,
but these can be understood because of the self-similar
nature of their spectrum. The correlations display
power-law behavior.


The power laws that characterize the critical behavior have
universal properties which only depend on the dimensions
and nature of the critical point. Many models which may
be much more complicated, for example having quite a few
parameters at the start, may move into a universal fixed
point where their behavior is described by a much simpler
model with fewer parameters. In many lower-dimensional
cases the critical models can be solved exactly, which
provides important insights about the phase structure of large
classes of models; think for example of the Ising model.
These ideas were initially developed in statistical physics
by Michael Fisher and Leo Kadanoff, and, as mentioned, in
the context of quantum field theory carried further by
Kenneth Wilson, who received the Nobel prize for his work in
1982.


And it is indeed by the renormalization group approach 
that theorists have on the one hand been able to come 
up with many interesting and successful explanations, and 
on the other have been able to construct representative 
models for a myriad of physical phenomena. They could 
solve these simplified models exactly and therefore could 
provide calculable models for a vast range of critical phe- 
nomena. 


So to conclude, we have shown that one may think of a
space of coupling constants where a given theory is
characterized by some point in that space where the couplings
take particular values. Now there is a set of coupled
renormalization group equations for this set which determines a
flow of the point through this space that may or may not
end up in some fixed point. In a fixed point the system’s
behavior becomes scale invariant, and as such it exhibits
some characteristic universal behavior of the theory. The
renormalization group equations define flow lines in the
space of parameters, and starting at a given point in the
space the theory follows the flow line to some fixed point.
Clearly many different theories can end up in the same
type of fixed point, and that is what we mean by universal
critical behavior; see Figure III.4.15(a).
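The idea that different starting points flow to the same fixed point can be illustrated with a toy flow equation. The sketch below assumes a single coupling g obeying $dg/dl = \epsilon g - g^2$, where l is the logarithm of the length scale; this schematic equation, the value $\epsilon = 1$ and the simple Euler integration are illustrative assumptions, not a formula taken from the text. Its fixed point at $g^* = \epsilon$ is reached from any positive starting value.

```python
def flow_to_fixed_point(g0, eps=1.0, dl=0.01, steps=5000):
    """Integrate dg/dl = eps*g - g**2 with Euler steps and return the final g."""
    g = g0
    for _ in range(steps):
        g += dl * (eps * g - g * g)
    return g

# Three different 'theories' (starting couplings) end up at the same fixed point.
for g_start in (0.1, 0.5, 2.0):
    print(g_start, flow_to_fixed_point(g_start))
```

The memory of the starting value is lost along the flow, which is the simplest caricature of universality.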
On the calculation of quantum corrections
The theory 


Renormalization is a scheme that guarantees a peaceful 
coexistence with infinities. 


JIS 


What do we mean if we say we have a quantum field theory
like QED or the Standard Model, or a theory of pions,
or of superconductivity? The casual term ‘the theory’ usually
refers to a number of inclusions or formal steps starting
from three ingredients:

(i) an action (or Hamiltonian), which allows us to derive a
set of

(ii) Feynman rules describing the propagators (two-point
(correlation) functions) for the particles in the theory, and
also the interaction vertices;

(iii) If we are interested in a particular physical quantum
process, we can usually not calculate the probability amplitude
for that process exactly. It is however possible to
make a systematic perturbative approximation, by making
a diagrammatic expansion for the quantum amplitude of
any process in increasing powers of the relevant coupling
constant(s) and in powers of ħ.
Such an approximation scheme is only reliable if the expansion
parameter is sufficiently small. This procedure,
called perturbation theory, is schematically depicted in
Figure III.4.19.


Perturbation theory 


The toy model as tutorial in the language of diagrams.
Let us take a very simple toy model to illustrate the
quintessential difference between classical and quantum
reasoning.⁵ The model is a drastic simplification and
only serves to illustrate certain generic properties of quantum
corrections. We are not about to really calculate anything
realistic because it turns out that those calculations


⁵ I encountered this model in a set of lecture notes on ‘Applications
of QFT to Geometry’ by Dr Andy Neitzke of Princeton University.


[Figure: panels labelled ‘Action’ (showing the QED action S) and ‘Diagrammatic expansions in coupling and ħ’.]


Figure III.4.19: The perturbative approach. A theory (like QED) 
is defined by its classical action, giving the functional form of the 
theory in terms of the particles (fields) and their interactions. 
From the action one derives the set of Feynman rules that
allow for a systematic diagrammatic expansion of any physical
process. This is a series expansion in increasing powers of the
coupling(s) and the Planck constant ħ.


are quite complicated, and they are what a lot of bright
students spend a considerable amount of time on. But luckily
we are not part of the workforce; we are just curious
tourists! We just want to stare in awe at the statue and
need not make one ourselves; we love to eat a sausage
but would rather not go through how they are made! We are here
to see how others did the work!


The toy model is a field theory in zero-dimensional space-time,
where we consider two real-valued fields φ and χ.
You could say we are studying a system with two modes.
The action function has only mass terms and an interaction
term (there are no space or time derivatives) and looks
therefore almost trivial:

$$S(\varphi,\chi) = \frac{m^2}{2}\varphi^2 + \frac{M^2}{2}\chi^2 + \frac{\lambda}{4}\varphi^2\chi^2, \qquad \text{(III.4.26)}$$

and let us assume that M > m. You might wonder, what if
Figure III.4.20: Action of toy model. The surface corresponds
to the classical action S(φ, χ) as a function of the variables φ
and χ.


anything we can learn from a model in which such a drastic 
amputation of reality has taken place. In a sense you are 
correct: the fields are just real variables, and the quantum
aspect, as we will see, is kind of restricted to the ħ which we
stick in. So in the end we integrate a function, and expand
the result, and yes the structure one obtains looks very 
much like the things we encounter in field theory. This is 
a pedagogical workout, illuminating and even fun. Let us 
therefore respectfully execute some ‘standard calculations’ 
imagining that we are dealing with a real field theory and 
see what it delivers and also what not. 


The three terms in the action correspond to the three Feynman
rules (the elementary diagrams) that we give in
Figure III.4.21. The free part yields the two ‘propagators,’ and
the quartic term yields the interaction term with coupling
strength equal to λ.


Effective actions. Let us consider the most trivial pro- 
cess imaginable namely where the in and out state are 
both empty. Classically if nothing goes in and nothing goes 


Figure III.4.21: Feynman rules for toy model. The Feynman
rules are derived from the action and give the functional form of the
various terms in the action.


out then there is a unit probability that nothing happens in 
between. 


What we have is a very heavy mode and a very light mode
that interact with each other. What you have classically is
that it costs a lot of energy to excite the heavy χ mode and
that it is easy to excite the light φ mode. So for energies
well below M only the light mode will be present and we
can forget about the heavy χ mode altogether. But if we do
a quantum calculation we should allow for virtual manifestations
of the heavy mode, and we have to integrate over
all possible values that field may take. We say that we
‘integrate out’ the heavy mode. And this in turn will drastically
change the resulting effective theory for the light mode. It
will change three things: (i) it will change the mass of the
light mode, (ii) it will change the strength of the interaction
term and (iii) it will generate an infinite number of new
self-interactions for the light mode. These are quantum effects
that affect the low energy behavior of the theory. And
these are precisely the generic aspects we like to illustrate
with this tiny toy model. We can integrate over the χ
Figure III.4.22: Effective action for φ field. The graphs correspond
to (i) the classical action $S_0(\varphi) = S(\varphi, 0)$ (blue curve), (ii)
the effective action $S_{\mathrm{eff}}$ including the lowest order corrections
in χ (dark purple curve), and (iii) the complete effective action
where the χ field has been integrated out exactly (light purple
curve).


field variable and extract an effective action $S_{\mathrm{eff}}(\varphi)$ for the
φ field through the defining relation:

$$e^{-S_{\mathrm{eff}}(\varphi)/\hbar} = \int e^{-S(\varphi,\chi)/\hbar}\, d\chi. \qquad \text{(III.4.27)}$$

Is this very complicated you may ask? The answer is: if
you have real space-time dependent fields it is quite involved,
but in our little kindergarten theory, there are no
evil agents that could spoil our curiosity. As you know the
action function is just quadratic in φ as well as χ, which
means that the complicated looking formula involves just
one Gaussian integral over χ:

$$\int_{-\infty}^{\infty} e^{-a\chi^2/2}\, d\chi = \left(\frac{2\pi}{a}\right)^{1/2}. \qquad \text{(III.4.28)}$$

In our integral we have that $a = (M^2 + \lambda\varphi^2/2)/\hbar$ and we
obtain for the integral $(2\pi/a)^{1/2}$. So the effective


Figure III.4.23: Effective action expansion. The diagrammatic
expansion of the effective action for the toy model of equation
(III.4.26), and the expression for the terms up to order λ² and ħ.


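action for the φ field becomes (up to a φ-independent constant):

$$S_{\mathrm{eff}}(\varphi) = \frac{m^2}{2}\varphi^2 + \frac{\hbar}{2}\ln\left(1 + \frac{\lambda\varphi^2}{2M^2}\right).$$

This logarithm ln(1 + b) can for small b be expanded in a
power series $\ln(1+b) = b - \tfrac{1}{2}b^2 + \tfrac{1}{3}b^3 - \dots$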


Now on a quantum level we are supposed to draw all possi- 
ble vacuum to vacuum diagrams: these are diagrams with- 
out incoming or outgoing lines. Are such diagrams possi- 
ble? Well, yes, of course! We have drawn the first few 
diagrams in Figure III.4.23, where we listed them in powers
of the coupling constant λ and included all diagrams
up to second order. Applying the Feynman rules given in 
Figure III.4.21, we can in principle write the amplitudes but 
we are in particular interested in the coefficients of the suc- 
cessive terms and the powers in terms of the fields. After 
some algebra you get a result for the sum that is not too 
Figure III.4.24: Effective Feynman rules for toy model. These
are the effective Feynman rules for the χ field if we treat the φ
field as an external source corresponding to the green dot.


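surprising either:

$$S_{\mathrm{eff}}(\varphi) = S(\varphi, 0) + \frac{\hbar\lambda}{4M^2}\varphi^2 - \frac{\hbar\lambda^2}{16M^4}\varphi^4 + \frac{\hbar\lambda^3}{48M^6}\varphi^6 + \cdots \qquad \text{(III.4.29)}$$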
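Because the toy model is just an ordinary integral, the whole story can be checked on a computer. The sketch below assumes the action $S = \tfrac{1}{2}m^2\varphi^2 + \tfrac{1}{2}M^2\chi^2 + \tfrac{1}{4}\lambda\varphi^2\chi^2$ and integrates out χ numerically, normalizing so that the effective action vanishes at φ = 0. It then compares the result with the closed form $\tfrac{1}{2}m^2\varphi^2 + \tfrac{\hbar}{2}\ln(1 + \lambda\varphi^2/2M^2)$ and with its expansion up to order λ³. The parameter values, function names and the trapezoidal integration are illustrative choices, not part of the original text.

```python
import math

M_LIGHT, M_HEAVY, LAM, HBAR = 1.0, 5.0, 2.0, 1.0   # illustrative values only

def action(phi, chi):
    """Toy-model action S(phi, chi) with two mass terms and one interaction."""
    return (0.5 * M_LIGHT**2 * phi**2 + 0.5 * M_HEAVY**2 * chi**2
            + 0.25 * LAM * phi**2 * chi**2)

def s_eff_numeric(phi, n=20001):
    """Integrate out chi with the trapezoidal rule; normalized to S_eff(0) = 0."""
    half_width = 12.0 * math.sqrt(HBAR) / M_HEAVY   # many Gaussian widths
    step = 2 * half_width / (n - 1)

    def z(p):
        total = 0.0
        for i in range(n):
            chi = -half_width + i * step
            weight = 0.5 if i in (0, n - 1) else 1.0
            total += weight * math.exp(-action(p, chi) / HBAR)
        return total * step

    return -HBAR * math.log(z(phi) / z(0.0))

def s_eff_closed(phi):
    """Closed form obtained from the Gaussian integral."""
    b = LAM * phi**2 / (2 * M_HEAVY**2)
    return 0.5 * M_LIGHT**2 * phi**2 + 0.5 * HBAR * math.log(1 + b)

def s_eff_series(phi):
    """Classical term plus the first three quantum corrections."""
    b = LAM * phi**2 / (2 * M_HEAVY**2)
    return 0.5 * M_LIGHT**2 * phi**2 + 0.5 * HBAR * (b - b**2 / 2 + b**3 / 3)

phi = 1.5
print(s_eff_numeric(phi), s_eff_closed(phi), s_eff_series(phi))
```

For a heavy mode (small λφ²/2M²) the three numbers agree closely; making M_HEAVY smaller shows the truncated series drifting away from the exact answer.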


This is an expression worth contemplating, because it ex- 
hibits many structural features of what quantum correc- 
tions on classical physics look like. We make the following 
observations: 

(i) First of all note that the corrections have a factor ħ, so
they vanish in the classical limit: in the classical limit the
χ field decouples and it does not affect the
effective φ theory.

(ii) The second remarkable fact is that summing over all χ
contributions, which is what integrating the field out means,
generates self-interactions of order n with an effective coupling
constant $\lambda_n \sim \hbar(\lambda/M^2)^n$. Most important is the lowest
order term, quadratic in φ: in other words it will shift the
mass to $m_{\mathrm{eff}}^2 = m^2 + \hbar\lambda/2M^2$. The take home message is
that quantum corrections may introduce novel interaction
terms that were not there on a classical level.


Figure III.4.25: Effective action for φ field. This is the effective
action for the φ field if we integrate out the high mass χ
field. All diagrams have one loop and thus one power of ħ, and
they all contribute to the lowest order quantum corrections. The
power of the coupling parameter λ/2M² is given by the number
of propagators in each diagram.


(iii) One might also derive effective Feynman diagrams for
the φ field where the higher order terms are represented
as new couplings $\lambda_{2n}$ labeling the strength of the vertices
with 2n external lines. This is depicted in Figure III.4.26.
(iv) A fundamental question that remains at this point is
whether the effective quantum theory can produce interaction
terms that violate the symmetries (and therefore
conservation laws) of the classical theory. We see an example
in the toy model above. The effective potential for the φ
field has positive coefficients for the φ² and φ⁶ terms but
a negative coefficient for the φ⁴ term, which means that
the potential truncated at this order can have a local minimum
at φ = 0 but also at φ ≠ 0. It would correspond to a metastable
state where the mirror symmetry φ → −φ is violated. We return
to this question briefly when we talk about anomalies.
Figure III.4.26: Effective Feynman rules. The Feynman rules
for the effective action for the φ field. The terms are proportional to
$\lambda_{2n}\varphi^{2n}$ and yield the diagrams as shown.


Quantum fluctuations in QED 


We pointed out before that we may specify a theory by pos- 
tulating a set of fields representing the basic constituents 
and their interactions by giving the coupled equations they 
have to satisfy or equivalently by giving an energy or ac- 
tion function(al) in terms of them including the interactions. 
Such a model is characterized by its particular functional 
form which contains a number of parameters like coupling 
strengths and masses. These parameters are just the co- 
efficients of the various terms in the action. Of course 
there are also the universal parameters such as the ve- 
locity of light and Planck’s constant, which are hidden as 
we have ‘set’ them equal to one. 


Now you would think that the parameters are directly
determined by making measurements of them. Here we have
to be careful because the story is not so simple.
In quantum theories, even in the most idealized situations,
one has to deal with the effect of quantum fluctuations,
because such fluctuations are an inevitable ingredient as a
consequence of the uncertainty relations between position
and momentum and between time and energy. The size of the
energy fluctuations grows in inverse proportion to the spatial
scale one chooses to look at. So, the theory also describes
what the fluctuations are in these quantities, and if one
goes to smaller distances or higher momenta the effect of
these fluctuations is that they will lead to significant
differences between the bare values of the parameters that I
wrote down in the equations and those that would effectively
be observed. The parameters are indeed external
but they are in fact corrected by the quantum processes
described by the theory.
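The inverse relation between spatial scale and energy fluctuation can be turned into numbers with the rough estimate ΔE ≈ ħc/Δx, using ħc ≈ 197.327 MeV·fm. The sketch below is only an order-of-magnitude illustration; the function name and the chosen distances are assumptions made for the example.

```python
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV * femtometre

def energy_fluctuation_mev(distance_fm):
    """Order-of-magnitude energy fluctuation at spatial scale distance_fm."""
    return HBAR_C_MEV_FM / distance_fm

scales_fm = {
    "atom (about 1e5 fm)": 1.0e5,
    "reduced Compton wavelength of the electron (386.16 fm)": 386.16,
    "proton size (about 1 fm)": 1.0,
}
for label, d in scales_fm.items():
    print(label, energy_fluctuation_mev(d), "MeV")
```

At the reduced Compton wavelength of the electron the estimate reaches the electron rest energy of about 0.511 MeV, which is the scale at which virtual electron-positron pairs start to matter.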


To make a consistent comparison with experimental re- 
sults one should first calculate, then include these ‘quan- 
tum’ corrections and then choose the bare parameters in 
such a way that the observed data match the calculated 
parameters including the corrections. It’s like buying a 
box of chocolates, since there may be a significant differ- 
ence between the weight of the box as a whole and the 
net weight of the chocolates, as the wrapping may be
surprisingly elaborate. The lore is that the more exquisite the
chocolates the more elaborate the wrapping. Reality is
similarly hidden from us by an elaborate quantum wrapping.


The calculations of these corrections turn out to be quite
involved. What we would like to do here is not so much to do
such calculations as to outline the structure of what they
involve, and what all that has to do with the scale dependence
of the theory. Briefly stated: if one naively calculates
these quantum corrections using the diagrammatic
approach of Feynman, one finds that the calculations diverge,
that is, they give infinite answers. This is not so much
an indication that things are wrong, but rather that they are
more subtle than you would naively expect. And Nature is
subtle for sure.
What happens is that if one calculates the effect of certain
quantum degrees of freedom, these cause infinite changes
in the effective parameters of the theory, and that would
render the theory useless, except when these divergencies
can be ‘subtracted’ in a meaningful and consistent
way that allows for a set of uniquely defined finite parameters
after all. This procedure for dealing in a physically
sensible way with these unwanted infinities is called
renormalization, which in turn can be understood in a more
general approach called the renormalization group.


What one learns is that the renormalization procedure im- 
poses serious constraints on the set of couplings or inter- 
action terms one starts off with. Theories satisfying these 
constraints are called renormalizable and you will not be 
surprised to hear that the Standard Model of elementary 
particles and their interactions is a renormalizable gauge 
theory. However, Einstein’s theory of general relativity is 
not renormalizable in the above sense, and the construc- 
tion of a quantum theory of gravity is still best described as 
‘work in progress.’ 


Renormalization. Renormalization amounts to systemat- 
ically extracting the finite quantum corrections to the pa- 
rameters of the bare (classical) theory. It involves a rather 
technical two-step procedure to handle the infinities that 
pop up in the calculations of quantum corrections to masses 
and other coupling constants. The first part is regulariza- 
tion of the divergent expressions. This can be done in 
many different ways, but the simplest conceivable is to just 
introduce a cut-off in momentum space. This means that 
we simply ignore the contributions of very high momentum 
fluctuations. The second part is to introduce a subtraction 
depending on the cutoff, which renders the calculated am- 
plitudes finite. The subtraction involves the introduction of 
counter terms in the action, and once these have been
introduced one can take the limit of the cut-off to infinity. The
dependence on the cut-off has then disappeared and one is left
with a finite, physically meaningful result.
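The two steps, regularization and subtraction, can be mimicked with a toy integral. The sketch below assumes the logarithmically divergent integral $I(k, \Lambda) = \int_0^{\Lambda} dp/(p + k) = \ln[(\Lambda + k)/k]$, which is not a QED amplitude but has the same structure: it grows without bound with the cut-off Λ, while the difference between its values at two momenta k and k_ref stays finite and tends to ln(k_ref/k). The function names and the choice of integral are illustrative assumptions.

```python
import math

def loop_integral(k, cutoff):
    """Regularized toy integral: int_0^cutoff dp / (p + k) = ln((cutoff + k) / k)."""
    return math.log((cutoff + k) / k)

def subtracted(k, k_ref, cutoff):
    """Subtract the value at a reference momentum; the cut-off dependence cancels."""
    return loop_integral(k, cutoff) - loop_integral(k_ref, cutoff)

for cutoff in (1e3, 1e6, 1e12):
    print(cutoff, loop_integral(2.0, cutoff), subtracted(2.0, 1.0, cutoff))
```

The unsubtracted column keeps growing with the cut-off, whereas the subtracted column settles at ln(1/2): the divergent piece does not depend on k, so it can be absorbed into the definition of a parameter measured at the reference scale.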


Figure III.4.27: Virtual electron-positron pairs. Vacuum fluctu- 
ations in the electromagnetic field give rise to a cloud of virtual 
electron-positron pairs that effectively screen the ‘bare’ charge, 
and make the effective charge distance or momentum depen- 
dent. 


In practice the contribution of the quantum fluctuations
depends on two things: (i) a momentum cut-off Λ, which
indicates that one only takes into account fluctuations larger
than a certain spatial scale d > 1/Λ, and (ii) how
accurately one calculates the effect of the fluctuations on the
parameter values of interest. The calculated parameter
change is encoded in what is called the β function, and
this function can be calculated to an increasing degree
of accuracy. We have already discussed examples which
showed that these technical considerations are crucial in
determining in which parameter domains one may expect
results that do or do not make sense. In the following
paragraphs we will give some remarkable results that will show
the analytic power of these methods when it comes to
understanding the asymptotic (high-energy) behavior of physical
systems and the theories that describe them.
[Figure panels: Classical ⇒ Quantum; e⁻e⁻ scattering]


Figure III.4.28: From classical to quantum process. The Feynman
diagram on the left represents two electrons that scatter off
each other by exchanging a single photon. This diagram yields
the classical result. On the right we give the quantum process
where we have ‘dressed’ the photon propagator with a ‘blob’,
which means that all quantum corrections have been included.


A realistic example: Vacuum polarization 


We think of force laws like the gravitational law of Newton
or the Coulomb law of electrostatics as specifying the
strength of a force depending on some charge or mass
and on some variable like the distance. Then
there is also an interaction strength, which is a dimensionful
constant (parameter) to be determined through experiment.
That means we have to measure it at some characteristic
scale and then assume it is constant not only in
time but also in space. Both assumptions may be challenged.
It may well be that by going to smaller or larger
distances the effective coupling constants, if one measures
them, would change.


Let me indicate why the effective coupling might change by 
exploiting some of the intuitive notions we have mentioned 
before. It is clear that if we have a charge for example, 




Figure III.4.29: Quantum corrections. Corrections due to virtual
processes to the photon propagator. The blob has a systematic
expansion in terms of ever more complex diagrams. Two
blue dots lead to a factor $\alpha = e^2/4\pi\hbar c \simeq 1/137$ in the
contribution of the diagram to the quantum mechanical scattering
amplitude. So higher order terms (in α) become smaller and usually
we include diagrams up to second order.


the field around this charge will become ever stronger at
small distances. The energy density of the field increases
and may at a certain distance become so big that it becomes
possible, by Einstein’s E = mc² law, that charged
particle-antiparticle pairs are created near the charge. The
idea is that the ‘empty’ space is not empty at all but filled
with electron-positron pairs that form a cloud around the
charge. This cloud will in fact screen the ‘bare’ charge of
the electron. This means that at a distance further out we
see an effective charge that will be smaller than the charge
we started off with. Translated into the language of the coupling
strength of the charge to the field, we see that it is not
constant but depends on the scale at which it is measured.
The amount of screening depends on the distance at which we
look at the charge. We say that the vacuum becomes polarized.
As we measure at some distance, it is interesting
to ask whether we can find out what the bare charge
would be, or what the effective charge at other distances
would be. The coupling constants may be on the run but
where are they going? Is the coupling going to be ever smaller or ever
bigger and maybe become infinite? In theories of various
sorts describing different types of interactions many different
scenarios present themselves, including the possibility
that bare parameters that were chosen to be zero become
non-zero due to these quantum fluctuation effects, which
basically amounts to saying that the theory itself acquires
new extra parameters that you didn’t put in at the start! Under
certain circumstances theories are apparently capable
of ‘improving’ upon themselves. One might say that taking
this kind of background into account provides insight into the
range of validity of the theory one started off with, and that
is certainly a remarkable conclusion that deserves a closer
look.


A divergent diagram. Let us consider the two-point function
for the photon. In the second line of Figure III.4.29 we
have drawn some Feynman diagrams that describe processes
that contribute to the propagator or two-point function
for the photon. In the first, the wiggly line is just the
bare propagator, which in momentum space is given
by the expression:

$$S(k) \sim \frac{1}{k^2}.$$

So this describes a mode of the photon with momentum k
propagating between two space-time points. The second
diagram has two interactions, where an electron-positron
pair is created and subsequently annihilated. It is a so-called
virtual process because there are no external lines
connected to the closed fermionic loop. Now momentum
is conserved at the interaction points, so overall that means
that the momentum carried by the ingoing photon must
be the same as that carried by the outgoing photon, and
at the vertex it implies that if the electron created has
momentum p, then the positron has to have momentum k − p.
If we just do the counting of powers of p, the propagator of
the electron yields a factor 1/(p − m), and the positron a
factor 1/(k − p − m). The problem arises because we have
to sum or integrate all possible amplitudes, which means
all possible values of the momentum p going around the
loop. So we have to calculate an integral

$$\int^{\Lambda} \frac{d^4p}{(p-m)(k-p-m)} \sim \int^{\Lambda} p\,dp \sim \Lambda^2 .$$

For large p the dominant contribution comes from

$$\int^{\infty} \frac{p^3}{p^2}\,dp = \int^{\infty} p\,dp = \infty .$$


In other words, the integral behaves badly and is divergent!
This is bad news because we know that physical amplitudes and
probabilities are finite. What we need is a way to manage the
deluge of infinities popping up in our calculations in such a
way that physically meaningful results are obtained. The
infinities have to be artefacts of our calculational
methodology; otherwise the theory makes no sense.


This leads to the intricate protocol called renormalization
that we have mentioned before. It refers to a three-step
procedure, where we first regulate the divergencies and then
subtract the would-be divergencies, which allows us to redefine
or renormalize the fields and parameters in the theory in a
consistent and unique way.


Regularization and renormalization. The first step we take
is to regulate the divergent integral in some way, by
introducing a high-momentum cut-off, meaning that we limit the
momentum range we integrate over such that $p < \Lambda$. Then
the leading term will be quadratic in $\Lambda$, as indicated
in the equation above.
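The quadratic growth with the cut-off is easy to check
numerically. The following is a minimal sketch in Python that
only looks at the large-momentum part of the integrand, the
integral of $p\,dp$ up to $\Lambda$; the function name
cutoff_integral and the chosen cut-off values are ours and
purely illustrative.

```python
def cutoff_integral(cutoff, steps=100_000):
    """Midpoint-rule estimate of the integral of p dp from 0 to the cut-off."""
    dp = cutoff / steps
    return sum((i + 0.5) * dp * dp for i in range(steps))

for cutoff in (10.0, 100.0, 1000.0):
    value = cutoff_integral(cutoff)
    ratio = value / cutoff**2
    print(f"Lambda = {cutoff:7.1f}   integral = {value:12.1f}   "
          f"integral / Lambda^2 = {ratio:.4f}")
```

The ratio in the last column stays fixed while the integral
itself blows up as the cut-off is raised, which is exactly the
statement that the leading term is quadratic in $\Lambda$.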


Once you have applied such a regularization to all the
divergent expressions, renormalization means that you apply a
well-defined procedure to subtract the divergent expressions in
a consistent way that leaves you with unique finite results for
the quantum corrections to any diagram with given external
lines. However, there are only a finite number of
renormalizations (correction factors) you can implement in a
given field theory; you can basically renormalize the fields,
the masses and the coupling constants and that’s it. So for
electrodynamics you could at most accommodate two field, one
mass, and one coupling constant renormalizations. There are
dependencies between them, and one is left with three
correction factors, $Z_1$, $Z_2$ and $Z_3$, associated with the
renormalization of the wave function(s), the charge and the
fermion mass respectively. The fact that QED is renormalizable
means that if you calculate all diagrams for all amplitudes to
arbitrary order in the coupling constant, all divergencies that
you will ever encounter can be absorbed in those three
constants. This is by no means obvious; it means that the
theory has to meet certain exquisite requirements. We will have
more to say about what that means and which generic properties
determine whether a theory is renormalizable or not.


Let us reflect for a moment on what the above technical and
rather magical manipulations have to do with the main subject
of this chapter, which is ‘scaling’ and ‘scale invariance’. It
is quite clear that once you introduce a cut-off, or any other
way to regularize the theory, that will break any form of scale
invariance, precisely because we explicitly introduce a scale
in the theory ‘by hand’. And though the results claim to be
independent of the particular value of the cut-off,
renormalization nevertheless deeply affects the high-energy
asymptotic behavior of quantum field theories, and in
particular spoils the scale invariance one might have expected.


The cut-off and the subtraction point 


The role of the cut-off is rather profound. With a bit of
common sense one would say: of course there ought to be a
cut-off, because the theory may not be fit to describe
fluctuations in the medium below a certain scale. Think of a
fluid, which on a macroscopic level is a continuum, but if we
go down in scale we know that it is ultimately just a
collection of molecules, and on that scale the continuum
assumption is certainly a bad one. Evidently in such a case it
is the interatomic separation in the liquid that sets the scale
for the distance cut-off $d \sim 1/\Lambda$. Let us now turn to
the all-important question of the accuracy of the $\beta$
functions, i.e. the functions that describe the scale
dependence of the effective parameters in the model. The
arguments became rather subtle, to the point where even the
scientists themselves became utterly surprised by the success
of their calculations. What happened? In many cases the
difference between the measured quantities and the calculated
ones grew ever larger with increasing momentum, and indeed new
parameters had to be introduced in the bare energy function.
What one did was to introduce so-called counter terms, which
also depend on the cut-off and cancel the calculated effect,
and after that let the cut-off go to infinity (or zero), so
that the difference ended up being finite and independent of
the cut-off. The physicists developed a well-defined procedure,
or maybe we should call it a calculational trick, called
renormalization, that would lead to predictions free of
ambiguities if and only if, after some given order in the
approximation scheme of the $\beta$ function, no new parameters
had to be introduced. That means that after a certain point the
number of parameters of the theory would stay fixed and finite.
Renormalization would then only change those parameters, and
that was considered admissible from a physical point of view,
though mathematically one was kind of juggling infinities to
fabricate finite numbers that should fit the experimental data.
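The logic of a counter term and a subtraction point can be
mimicked in a deliberately oversimplified toy model. The Python
sketch below is not a QED calculation: we simply assume a
‘loop correction’ that grows logarithmically with the cut-off,
and all names and numbers (B, MU_0, G_MEASURED) are hypothetical
values chosen for illustration only.

```python
import math

B = 0.05            # hypothetical coefficient of the logarithmic divergence
MU_0 = 1.0          # subtraction point: the scale where we measure
G_MEASURED = 0.30   # hypothetical measured value at the scale MU_0

def bare_coupling(cutoff):
    """Bare parameter including the counter term.

    It is tuned, for every cut-off, such that the result at the
    subtraction point MU_0 reproduces the measured value.
    """
    return G_MEASURED - B * math.log(cutoff / MU_0)

def effective_coupling(mu, cutoff):
    """Bare value plus the cut-off dependent toy loop correction at scale mu."""
    return bare_coupling(cutoff) + B * math.log(cutoff / mu)

for cutoff in (1e3, 1e6, 1e12):
    print(f"cut-off = {cutoff:8.0e}   bare = {bare_coupling(cutoff):8.4f}   "
          f"prediction at mu = 10: {effective_coupling(10.0, cutoff):.6f}")
```

The bare parameter drifts off without bound as the cut-off is
raised, but the prediction at any other scale does not care: it
depends only on the measured value and on the ratio of the two
scales. In this toy model one measured number suffices; the
renormalizable theories are those where a finite number of such
inputs does the job to all orders.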


But as usual the proof was in eating the sausage without
advertising too much what went into it. And the results turned
out to be splendid: the renormalization methods allowed us to
calculate many new physical effects with exceptional precision.
The pinnacle of such calculations is the high-order calculation
of the anomalous magnetic moment of the electron, which matches
experiment up to 11 significant digits! Now that is what one
calls hard science!


Figure III.4.30: g − 2 diagrams. Some of the tenth-order
diagrams contributing to the calculation of the anomalous
magnetic moment of the electron or muon. (Physical Review
Letters 109, 111807, 2012)


In 1987 the experimental measurements (R. S. Van Dyck, 
Jr., P. B. Schwinberg and H. G. Dehmelt) reached the un- 
believable precision: 


$$a_e = (g - 2)/2 = 1159652188.4(4.3) \times 10^{-12} .$$


The heroic QED calculation to tenth order in perturbation
theory, involving 12,672 diagrams and performed by the Japanese
team of Aoyama, Hayakawa, Kinoshita, and Nio, produced the
theoretical value:


$$a_e(\text{theory}) = 1159652181.78(77) \times 10^{-12} ,$$


which was published in 2012. To give you an idea of what 
this looks like we present some of the tenth-order diagrams 
in Figure III.4.30. 
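To see how well the two numbers quoted above agree, one can
compare their difference with the quoted uncertainties. The
short Python sketch below does just that arithmetic; adding the
two uncertainties in quadrature is our assumption (it treats
them as independent), and the variable names are ours.

```python
import math

# Values quoted in the text, in units of 1e-12
A_EXP, SIGMA_EXP = 1159652188.4, 4.3    # 1987 measurement
A_TH, SIGMA_TH = 1159652181.78, 0.77    # 2012 tenth-order QED value

difference = A_EXP - A_TH
combined_sigma = math.hypot(SIGMA_EXP, SIGMA_TH)   # quadrature sum
relative_difference = difference / A_EXP

print(f"difference          = {difference:.2f} x 1e-12")
print(f"combined uncertainty = {combined_sigma:.2f} x 1e-12")
print(f"difference / sigma   = {difference / combined_sigma:.2f}")
print(f"relative difference  = {relative_difference:.1e}")
```

The difference comes out at roughly one and a half combined
standard deviations, and at a relative level of a few parts in
a billion, so theory and experiment are indeed consistent.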


Anomalies. If regularization violates the symmetries of the
classical action, we produce anomalies. The would-be conserved
current is no longer conserved: the divergence of the current
is no longer zero, but there is an anomalous source term in the
quantum version of that law. So the question is how serious
that is. What it means is that in the real quantum world we
would see processes that violate some naively expected
conservation laws. For example, there is a famous decay of a
neutral pion $\pi^0$ into two photons that would be forbidden
but has actually been observed, so such anomalous processes do
occur.


Now there is one important restriction here. As we have
argued, gauge symmetries lead to electric or color charge
conservation, and it is known that if we break local gauge
symmetries, that leads to severe inconsistencies and the theory
would become non-renormalizable. So in the first place we have
to make sure we have a gauge-invariant regulator. However, that
may not be enough, and one has to adjust the particle content
of the theory such that the contributions of the different
particle species to the anomaly cancel. This has indeed led to
the constraint of the family structure of the Standard Model.
If the particles appear in what we called ‘families’, then the
cancellation of all gauge anomalies is guaranteed.


As a matter of fact, here again it is the gravitational
interaction, which is after all a gauge theory, that causes
trouble: it has a gravitational anomaly, which makes the
‘naive’ perturbative quantization of Einstein’s general theory
of relativity a well-established nightmare! In fact this is
exactly why an anomaly-free gravity theory pops up in string
theory. It turns out that the gravitational anomalies cancel in
ten-dimensional space-time, where strings supposedly live.


Further reading.
On scaling, renormalization and critical phenomena:


Fractals
John P. Briggs
Touchstone Books (1992)


Quantum Field Theory
David Skinner
Cambridge University (Lecture notes)


An Introduction To Quantum Field Theory
Michael E. Peskin and Daniel V. Schroeder
CRC Press (1995)


The Theory of Critical Phenomena: An Introduction to the
Renormalization Group
J.J. Binney, N.J. Dowrick, A.J. Fisher and M.E.J. Newman
Clarendon Press (1992)


Phase Transitions and Renormalization Group
Jean Zinn-Justin
Oxford University Press (2013)


Complementary reading:


The Fractal Geometry of Nature
Benoit Mandelbrot
W. H. Freeman and Co. (1982)


Fractals: Endlessly Repeated Geometric Figures
H. Lauwerier
Princeton University Press (1991)


M.C. Escher: Art and Science
H.S.M. Coxeter, M. Emmer, R. Penrose and M.J. Teuber (Eds.)
North-Holland (1986)


The Mathematical Side of M.C. Escher
Doris Schattschneider
Article in Notices of the AMS, Volume 57, nr. 6 (2010)


Scale: The Universal Laws of Growth, Innovation,
Sustainability, and the Pace of Life in Organisms, Cities,
Economies, and Companies
Geoffrey B. West
Penguin Press (2017)


Nature in search of itself. 


Science is a deeply human endeavor, as it re- 
quires the unique combination of basic capa- 
bilities like curiosity, reason, intuition, creativ- 
ity and collaboration. It expresses the collec- 
tive curiosity of mankind and has resulted in 
the double helix of science and technology that 
keeps transforming our world over and again. 
It embodies a cumulative, evolutionary process 
that continuously creates new options for soci- 
ety while at the same time forcing it to face the 
severe ethical dilemmas that come along. 


All of us have witnessed how science has pro- 
foundly affected the human condition and trans- 
formed society, and how in many instances it 
managed to transcend man’s painful political, 
ethnic, and religious differences. As such it is 
a true cornerstone of civilization. At least as 
long as we can ensure that it does not fall prey 
to all kinds of abuse by dark forces bent on 
power and financial or political gain only. 


If knowledge is our destiny, then that feeds the 
hope for carving out a gateway to a common, 
global understanding of the world and our op- 
tions for governing it. It could lead the way 
towards an inhabitable future for all of us. 


Chapter III.5 


Power of the invisible 


Im ganzen habe ich jedenfalls erreicht, was ich erreichen
wollte. Man sage nicht, es wäre der Mühe nicht wert gewesen.
Im Übrigen will ich keines Menschen Urteil, ich will nur
Kenntnisse verbreiten, ich berichte nur, auch Ihnen, hohe
Herren von der Akademie, habe ich nur berichtet.
Franz Kafka, in Bericht für eine Akademie ¹

In this concluding chapter we briefly recapitulate our jour- 
ney through the quantum wonderland. It is a kind of mir- 
ror image of the introduction. The difference is that with 
the knowledge we have acquired along the way there is 
more room to reflect on the places we visited. This also 
means that there is some room for more subjective state- 
ments. 


A pillar of wisdom? The cartoon on the right by Pete Ryan
appeared in the New York Times. For me it is an ironic pillar
of wisdom depicting not only the wisdom itself, but also our
winding roads towards it. That process starts in quite an
orderly way at the bottom, with a number of parallel strands
going straight up. At some point you start wondering why the
strands go up so perfectly straight and parallel. And as soon
as you start to question the given narrative, things start to
diverge. The lines start wiggling and before you know it you
are caught up in a huge entanglement, a huge confusion, a
spaghetti-like mess of doubt and contradiction. How to move
forward? How to get out of this mess? And yes, every time, as
if by some miracle, you manage to surmount the problems, and
look what happens: things come together again, and they
coalesce into a new perception of reality symbolized by the
beautifully ornamented capital. Like a crown on your labor. You
managed to beat the minotaur hidden in that labyrinth of
strands.


Figure III.5.1: A modern pillar of wisdom? (Source: Pete Ryan,
NYT, Jan. 7, 2022)


¹ On the whole, at any rate, I have achieved what I set out to
achieve. But do not tell me that it was not worth the trouble.
In any case I am not appealing for any man’s verdict, I am only
imparting knowledge, I am only making a report. To you also,
honored Members of the Academy, I have only made a report.
(translation: Willa and Edwin Muir)


I suppose the artist has forgotten about the many dead-end
streets that are also part of the tangle. Maybe the artist was
inspired by the path integral approach to wisdom, where only
paths from A all the way to B have to be included. In fact all
of them have to be included, not just the shortest or the most
beautiful, but also the less obvious, maybe obscure and
low-probability routes. Anyway, after working your way through
all the possible paths you are bound to end up at a next level
of knowledge and understanding. Yet another shoulder of a giant
to stand on, yet another step on Cantor’s devil’s staircase to
ultimate knowledge and maybe wisdom....


Summary and outlook 


To see the power of the invisible in a way that supersedes 
blind faith, it has to be made discernible first. And that 
is what empirical science is about, inventing the observa- 
tional tools that allow us to see those things that have al- 
ways been there, but were hidden from the naked human 
eye. Why worry about the invisible, you may ask, as
long as the visible suffices to keep us busy and to fully 
occupy our fragile minds? Curiosity to know what may or 
may not be beyond what we can see is one of the ultimate 
drivers of our existence, of discovery, and in the long run 
of understanding, reason and survival. 


Who am I? If you were to ask me who I really am, I might start
by telling a nice story, probably a dressed-up CV of some sort
centered on my major accomplishments. In certain cases I may
even disclose some personal details. And if you keep pushing
me, it may turn into a narrative about my childhood, my family
and its traditions. And by talking about family traits I have,
without mentioning it, entered the realms of heredity and of
genetics. The narrative loses some of its ultra-personal
features and turns into a more generic, though still fully
anthropocentric, perspective. I will for example not mention
that features like my sense of humor, or my need to be
physically in touch, or my habits of impressing others, or of
getting enraged about futilities, probably go all the way back
to my primate or, for that matter, rabbit-like ancestors.


You understand what I am driving at: the deeper I search
myself and the world in which I live, the less personal the
story becomes, the more abstract it will be, and the less it
will refer to the plainly visible or the specifically human.
If your interrogation were to go on indefinitely, I might just
jot down some quantessential formulas in the end. And that is
how the science of the invisible enters our conversations as a
relevant resource of reliable knowledge, leaving the
limitations of anthropocentricity behind. Maybe that is the
power of the invisible.


The mission of physics. Physics is an empirical sci- 
ence which concerns the art of making discoveries through 
making ever more sophisticated observations. It wants to 
know what nature looks like and how it works on all scales. 
We have to admit that it certainly paid off when Galileo 
supposedly threw stones and wooden balls from the Pisa 
tower and carefully listened to them hitting the pavement! 
We make progress by building models and improving on 
them. The models are supposed not just to fit data but, more
particularly, to explain the different patterns in the data
by relating them through causal relationships expressed in
mathematical equations.


On all scales there is the question of what the relevant
degrees of freedom are, and to understand their behavior, like
structure formation through binding or a particular dynamics,
we need to understand the interactions between these relevant
constituents. Dynamical processes are generated by interactions
or forces between constituents, and that often makes their
overall effect hard to predict, exactly because coupling
systems introduces feedback loops. This is a general feature
that holds on both the classical and the quantum level. It is
particularly true for many-particle systems, but as we have
seen, it also holds for space-time. So, let us once more look
at the quantessence at large.


This book’s approach. This three-volume book is a somewhat
experimental and ambiguous ‘go in between’, in the sense that
it tries to interpolate between a ‘layman’s account’ and a
(I hate to say this) ‘textbook’ of a sort. Is it possible to
go in between without losing two audiences at once? My
publisher will undoubtedly let me know immediately, I am sure!


Another question that crosses the mind is whether all those
Wikis do not make books like these obsolete. I think the answer
is a firm ‘no’, and I would even claim the opposite. These
books attempt to be more than an encyclopedia and to give a
coherent account of a large range of topics that together form
a huge subject in science. The aim is to provide critical
guidance on which items out of the small infinity of Wikipedia
entries are actually relevant if you want to go quantum. I can
only hope that these books did indeed give you an informed
steer on when and where to go for additional Wiki-wisdom, and
what the keywords were to look for.


It’s the math, stupid! In confronting quantum realities this
could be the analogue of the political maxim ‘It’s the economy,
stupid!’, which was coined by the American political analyst
James Carville in 1992. He wanted to emphasize that even the
most basic knowledge of economics would stop people from making
absurd claims about everyday economic realities. We have used a
lot of mathematical language, mainly in order to keep the
arguments transparent and unambiguous and to prevent us from
committing crimes against logic. But we softened our approach
by paraphrasing the math with lots of prose so as to keep the
story accessible. However, in making that choice we sacrificed
a principal asset of mathematics, namely that it is extremely
concise and allows you to make precise yet brief arguments. The
true aesthetics of mathematics is deeply rooted in this idea of
eliminating all that is unnecessary. In that respect math is
the opposite of show business: no window-dressing allowed. We
exploited the unambiguous and transparent character of the
mathematical formalism, but at the same time blurred its purity
by talking extensively, in parallel, about what it means and by
using lots of illustrations. We immersed our math formulas in
the ‘unnecessary’ to keep them accessible and part of the
conversation. You could say that we fell back on show business
after all.


The three-track narrative. In an attempt to help overcome the
common fear of formulas, and to keep the contents manageable, I
adhered to a storytelling philosophy where the narrative
followed three tracks in parallel. The first was a pictorial
one, as I included over 450 illustrations, the second was the
rather extensive use of equations, and the third track
consisted of extensive prose. The latter is there in its own
right, but also to bridge the gaps between pictures and
formulas. The interplay between these tracks hopefully allowed
you to grasp this wonderful body of fundamental knowledge at
the heart of science. I am convinced that it made you at least
‘conversant’ about the quantessence of things.


The quantessence in retrospect. 


Let us briefly look back at the three volumes that make up
this quantum trilogy with Figure III.5.2 in mind. The reason
why this trio has such a wide scope is the plain fact that
quantum theory is a general set of principles that nature
appears to obey on all scales, at least as far as we have been
able to test. It applies to different types of systems, where
the translation of the fundamental quantum principles gets a
different mathematical implementation and outlook.


Three volumes. 


Volume I. In the first volume we have devoted quite a bit of 
time and room to provide a wide background by recalling 
the basic concepts of classical physics. This in order to 
provide a setting in which the quantessential parts of the 
subsequent volumes stand out more clearly. 


In Chapter I.1 we briefly reviewed the central achievements
of classical physics. And in Chapter I.2 we extended that with
the basics of relativity, geometry, and classical information
theory.


Chapter I.3 looks at the universal constants of nature and
what their meaning is. We showed how through dimen- 
sional analysis these constants set natural scales linked to 
certain classical and quantum phenomena. 


In Chapter I.4 we descended the quantum ladder in a systematic
way from the atomic scale down. This culminated
in a description of the Standard Model for the elemen- 
tary particles and the fundamental forces between them. 
We then continued with excursions into the speculative 
domains of supersymmetry and string theory as possible 
approaches to a consistent quantum theory that includes 
gravity: a quantum theory that would unify matter, radiation 
and space-time. 


Volume II. In the second volume we introduced the
mathematical framework and mostly applied it to basic systems
like qubits, electron spins, particles and simple field
theories.


In Chapter II.1 we discussed concepts like the Hilbert space
of states, a vector space where the linear superposition
principle holds, which quite directly leads to the possibility
of entangled states, which are uniquely quantum. These states
lead to intriguing paradoxes like ‘Schrödinger’s cat’ and the
EPR paradox, but at the same time opened up the possibility of
quantum teleportation and quantum key distribution.


In Chapter II.2 we introduced the observables as operators
acting on Hilbert space. This identification led to
quantessential notions like the incompatibility of observables,
which in turn gives rise to the fundamental uncertainties as
expressed by Heisenberg’s uncertainty relations. We also went
into various aspects of particle-wave duality, leading to
particle interference phenomena as discussed in Chapter II.3.


We demonstrated that the vastly different quantum setting 
allows for a new type of information processing and com- 
puting with a far-reaching technological potential. This is a 
major challenge and has become a high priority effort for 
the worldwide community of quantum condensed matter 
physicists. And in parallel to the struggle to produce scal- 
able and reliable hardware there is now also a booming 
branch of quantum software developments. 


In Chapter II.5 we explored a topological argument for the 
exclusion principle and the spin/statistics properties of quan- 
tum particles. 


Symmetry considerations play a central role in all fields of 
modern physics and chemistry. We therefore concluded 
the second volume with a chapter entirely devoted to the 
meaning and quantum implementations of symmetry and 
its breaking. 


Volume III. In the third volume we showed how the physics of
the early cosmic evolution in an expanding and cooling universe
is completely governed by the quantum laws. The resulting
structural hierarchy of matter reflects how the various
fundamental forces played dominant roles in successive stages
of that evolution.


Figure III.5.2: Book summary. The quantessence in retrospect.
(a) The book at large.


After discussing the basics of molecular (chemical) phys- 
ics, we turned to the many-body physics of condensed 
states of matter. First we described the types of order that 
the substrate of atoms or ions may exhibit, like crystal lat- 
tices of all sorts and their symmetries. We also consid- 
ered the defects or imperfections that may form in such 
highly ordered states of matter. These defects often carry 
quantum numbers that are conserved for topological rea- 
sons. 


In Chapter III.3 we turned to the electron collective and
how that gives rise to many surprising quantum phenom- 
ena like various types of conductivity and magnetism, from 
semi- to superconductors, quantum Hall states etc. It is 
amazing to see how many novel states of matter are pos- 
sible in the quantum regime. 


We closed our quantum excursions with Chapter III.4 on the
properties of scaling, first in the realms of geometry and then
in the context of dynamical systems. In the quantum regime this
leads to the notion of renormalization, which boils down to a
systematic scale-dependent redefinition of the parameters that
define the model. In quantum field theory this stands for
sophisticated procedures of juggling with infinities, leading
to a state of peaceful coexistence with them by producing
unambiguous finite answers. In addition to the understanding of
phenomena like the confinement of quarks, the renormalization
group approach provided a powerful tool for studying critical
phenomena in general.


Three layers. 


Layer A: Down and up the structural hierarchy. There is a
subtle difference between the first column of
Figure III.5.2(a), referring to the Volumes, and
Figure III.5.2(b), referring to the layers. In the first figure
the arrows are pointing downwards from the atomic scale to the
scale of quarks and leptons, while in the second they are all
pointing upwards. We recall that in Chapter I.4 we followed the
quest for ever more fundamental building blocks of matter,
indeed following the arrows down. However, in Chapter III.1 we
did the opposite, and followed the time path, not of the human
quest, but of the true cosmological history, by showing how the
hierarchy of matter, starting from the Big Bang all the way up
to the molecules of life, came into being. This perspective was
of course forced upon us after understanding the evolution of
space-time according to the Big Bang scenario described by the
theory of General Relativity.


Layer B: The hierarchy of mathematical realizations. 
The mathematical realizations of the basic quantum prin- 
ciples are shown in the second column of the figure. Sim- 
ply stated, the system one considers defines what the ba- 
sic degrees of freedom or dynamical variables are. Given 
the Hamiltonian one may then define the operators for the 
‘coordinates’ and conjugate ‘momenta’ and postulate their 
canonical commutation relations. The structure of the cor- 
responding Hilbert space of quantum states then follows. 
If the system cannot be solved exactly, which is mostly the
case, one usually starts from the non-interacting system
and uses that as the starting point for a perturbative
approach to the system with interactions.


What the middle column shows is that at the bottom of the
hierarchy, the most elementary quantum system is in fact the
qubit, or the spin-1/2 degree of freedom, with its
two-dimensional Hilbert space. This system was extensively
analysed in Chapters II.1 and II.2.
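As a reminder of how little is needed to see the quantum
structure at work, here is a minimal sketch in Python of the
qubit with its two-dimensional Hilbert space. It represents
states as two-component lists and observables as 2 x 2
matrices (the Pauli matrices); the helper names matmul,
commutator and expectation are ours and purely illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def expectation(op, state):
    """<state| op |state> for a normalized two-component state."""
    op_state = [sum(op[i][j] * state[j] for j in range(2)) for i in range(2)]
    return sum(state[i].conjugate() * op_state[i] for i in range(2))

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# A superposition of 'up' and 'down': (|0> + |1>) / sqrt(2)
plus = [2 ** -0.5, 2 ** -0.5]

print("[sigma_x, sigma_y] =", commutator(SIGMA_X, SIGMA_Y))
print("<sigma_z> =", expectation(SIGMA_Z, plus))
print("<sigma_x> =", expectation(SIGMA_X, plus))
```

The commutator of the first two Pauli matrices does not vanish
but equals 2i times the third, which is the simplest example of
incompatible observables. The superposition state has a
definite value for the spin along x, while the outcome for the
spin along z averages to zero.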


One step up we have the framework of quantum mechan- 
ics for a single particle, typically in an external potential 
leading to an infinite-dimensional Hilbert space of
normalizable wave functions. These notions were introduced in
Chapter I.4 in the section on ‘Atomic structure’ and we
repeatedly returned to this topic in the second volume, and
in particular in Chapter II.5.


At the next level of generality we include special relativity,
which forced us to move from quantum mechanics to the framework
of quantum field theory. Here the fields and their conjugate
field-momenta are the basic degrees of freedom, leading to the
multi-particle Hilbert space. This framework centers around
field operators that allow for the creation and annihilation of
particles, and therefore allows for the implementation of the
famous equivalence relation $E = mc^2$, for example as we see
it in processes like pair creation and annihilation in QED.
Field theory is the language of the Standard Model, but also of
most of condensed matter physics. Quantum field theory was
introduced in Chapter I.4 in the context of the Standard Model.
We returned to some of the formal aspects in Chapter II.5, and
applied it to the electron collective in Chapter III.3.
Finally, the scaling and renormalization aspects of field
theory were discussed in Chapter III.4.


A yet more general framework would allow for the consistent
inclusion of general relativity: in other words, the inclusion
of the gravitational force, implying the quantization of
space-time itself. This mission is not completed yet. The most
advanced models of this type are the superstring theories,
which we described towards the end of Chapter I.4. In this
framework each string mode corresponds to a different quantum
field. The string idea therefore unifies all fields, and thus
all particle types, into a single theory. This theory has
certainly deepened our understanding of the quantum properties
of gravity, for example of black holes, and resolved some of
the outstanding paradoxes, but it has not yet led to unique
explanations of observed phenomena like dark energy. And the
predictions it does make, like the 10-dimensional structure of
space-time or the existence of a myriad of super particles,
have not (yet) been confirmed by experiment.


Layer C : Quantum concepts and their meaning. The
third layer shows how the mathematically consistent
framework raised a number of conceptual issues physics had to
face. These issues concern the question of how to interpret
the core of physical reality. The subtitle of the book is
‘The quantessence of reality’ because that quantessence
has been shaking the foundations of many of our cherished
beliefs about what seemed to be self-evident features of
reality, features reflecting our classical intuitions.
These intuitions concern what the properties of physical
systems were supposed to be, and what the role of causality
and predictability in their mathematical framing amounted
to. What we have learned in a century of quantum
developments is that these changes are radical and will be
long lasting.


Starting at the bottom of the third column of Figure III.5.2(b),
we see that the structure of the space of states of
any quantum system is a vector (Hilbert) space, meaning
that the superposition principle holds, and that physical
states are represented by normalized vectors. If we
combine subsystems the total Hilbert space becomes
the (tensor) product space, implying that the dimension of
the total space is the product of the dimensions of the
subspaces. This structure implies the existence of entangled
states, which are states that correspond to normed vectors
in the total space that are not factorizable, that do not
correspond to a direct product of two vectors in the
subsystems.


Entanglement allows for the possibility of strong, very
quantessential, instantaneous correlations between outcomes
of measurements separated by arbitrarily large distances.
This led to a profound debate often referred to as the
Bohr-Einstein debate about the locality and causality properties
of physical reality. Experiments like the GHZ experiment
that we discussed in Chapter II.4 convincingly showed the
quantum interpretation to be correct.


Moving one step up in the column we mention that the
mathematical structure of quantum mechanics implies that
observables should be interpreted as (bounded) operators
acting on vectors in Hilbert space. These should be
thought of as (finite or infinite) matrices or differential
operators which by acting will in general change the state. The
fact that observables are no longer real-number-valued
variables like in classical physics immediately leads to the
problem of what a measurement exactly means. In the
Copenhagen interpretation it means that the measurement
outcome is a probabilistic one and furthermore that the act
of measurement will generically change the state of the
system. There is no longer a strict separation between
object and subject when observations are made. We can no
longer predict precisely what happens but can only
calculate the odds. This in turn means that we leave the notion
of classical determinism behind. Quantum means
indeterminism.


Another important consequence of the fact that
observables are operators is that they do not necessarily
commute. The outcome of their successive action on a given
vector may depend on the order in which you apply them. If
the operators do not commute, the corresponding
observables are called incompatible. This incompatibility lies at
the root of the intrinsic quantum uncertainties in
measurement outcomes so beautifully encoded in Heisenberg’s
uncertainty relations.


The structure of quantum reality also implies that we
cannot copy a quantum state while keeping the original; this is
known as the no-cloning theorem. However, what is
possible is to transfer a quantum state from one system to
another, and because of the entanglement property this can
in principle be done over arbitrarily large
distances. This possibility of quantum teleportation turned
the entanglement property into a blessing in disguise. It
enables another level of cyber security in data transfer.


Further conceptual consequences are evident if one thinks
about the quantessentials from the point of view of
information. The quantum states allow for storage of
information, and this naturally leads to the introduction of the
qubit as the quantum analogue of the digital bit. Quantum
mechanics allows for unheard-of possibilities to process this
quantum information. We see all around us that a major
quantum information revolution is on its way, a revolution 
that both on the hard and software side will radically trans- 
form our computational abilities. 
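

The structural statements recalled above (dimensions multiply under the tensor product, some states do not factorize, and operators need not commute) can be checked on the smallest possible example, a pair of qubits. The following is only a minimal sketch in plain Python; the helper names (kron, is_product_state, commutator) are made up for this illustration and do not appear elsewhere in the book. It uses the fact that a two-qubit state with coefficients (c00, c01, c10, c11) factorizes exactly when c00·c11 − c01·c10 vanishes.

```python
from math import sqrt

def kron(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [a * b for a in u for b in v]

def is_product_state(psi, tol=1e-12):
    """True if the two-qubit state (c00, c01, c10, c11) factorizes.

    The state is a direct product exactly when c00*c11 - c01*c10 = 0.
    """
    c00, c01, c10, c11 = psi
    return abs(c00 * c11 - c01 * c10) < tol

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[A, B] = AB - BA for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# Single-qubit states: |0> and the superposition (|0> + |1>)/sqrt(2).
zero = [1.0, 0.0]
plus = [1 / sqrt(2), 1 / sqrt(2)]

# Combining two qubits gives a 2 * 2 = 4 dimensional state vector.
product = kron(plus, zero)

# The Bell state (|00> + |11>)/sqrt(2) cannot be written as kron(u, v).
bell = [1 / sqrt(2), 0.0, 0.0, 1 / sqrt(2)]

# Two incompatible observables: the Pauli matrices X and Z.
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

print(len(product), is_product_state(product), is_product_state(bell))
print(commutator(X, Z))
```

The non-zero commutator of X and Z is the two-dimensional shadow of the incompatibility that underlies the uncertainty relations.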


A final radical ingredient of quantum reality manifests
itself if one studies the collective behavior of many-particle
systems. First of all, because particles of a certain
type correspond to basic modes of a single quantum field,
they are indistinguishable: they have a family name but
no first name, so to speak. In addition there is the
possibility of exclusion, saying that there cannot be more than
one particle in a given quantum state. This verdict is
anchored in the quantum interpretation of the Dirac field. We
addressed these fundamental properties of quantum
particles in Chapter II.5, and linked them to the topological
properties of the two-particle Hilbert space.
Indistinguishability and exclusion each modify the statistical
properties of many-body systems and create entirely novel
possibilities for the physical states of these systems. These
possibilities have made quantum condensed matter physics into an
inexhaustible source of technological innovations.
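

How indistinguishability and exclusion change the statistics can be seen by simply counting states. The sketch below is an illustration under simple assumptions (non-interacting particles, a finite number of single-particle levels); the function name count_states is hypothetical. It counts the ways to place n particles in k levels for labelled (classical) particles, for indistinguishable particles without exclusion (bosons), and for indistinguishable particles with exclusion (fermions).

```python
from math import comb

def count_states(n_particles, n_levels):
    """Number of many-particle states for three kinds of counting."""
    return {
        # Each labelled particle independently picks one of the levels.
        "distinguishable": n_levels ** n_particles,
        # Only occupation numbers matter; any occupation is allowed.
        "bosons": comb(n_particles + n_levels - 1, n_particles),
        # Only occupation numbers matter; each level holds at most one.
        "fermions": comb(n_levels, n_particles),
    }

counts = count_states(2, 3)
print(counts)
```

For two particles in three levels the nine classical configurations shrink to six for bosons and to three for fermions, which is the origin of the different statistical behavior of many-body systems.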


Altogether, the conceptual notions which surfaced in the
third layer, beautiful as they are, are a direct and therefore
necessary consequence of the basic logical structure of
quantum theory. There appears to be no way around them, and
more and more we start to appreciate how they have enriched
and broadened our perception of the roots of reality. They
embody a true revolution in our understanding of the physical
universe, one that found its translation into powerful new
technologies that have radically transformed our daily lives,
and will keep doing so.


The many topics we didn’t talk about. Many of the 
quantessential subjects we only touched upon superficially 
deserve chapters or books on their own. We spent a sec- 
tion on the miraculous properties of Carbon but what about 
a chapter on the virtues and technological blessings of sil- 
icon? What about the nano-sciences? What about an 


extensive review of an ever-growing list of alternative in- 
terpretations of quantum theory, like the ‘many-worlds in- 
terpretation’ proposed by the American physicist Hugh Ev- 
erett in his doctoral thesis at Princeton University in 1957? 
Indeed, there are many relevant topics that I
chose not to focus on and only mentioned in passing.


The main reason for these shortcomings is that I wanted
to stay faithful to the subtitle of the book and focus on the
Quantessence, the well-established fundamental aspects
of quantum reality: the perspective that shook the
scientific world a century ago and led to an unlimited
extension of technological opportunities and realities that is
far from exhausted or even fully explored. As I
emphasized all along, the era of quantum information
technologies, for example, has only just started.


Common denominators. 


The power of information as fundamental concept. Fig- 
ure III.5.2(a) is just like the figure we presented in the In- 
troduction to the book except that on the right we added 
a full column referring to the notion of information. It un- 
derscores that on all levels we may include an information 
science and computational perspective in the framework. 
All systems are in a sense information carriers and infor- 
mation processing devices, meaning that we set up paths 
with preset interactions between these carriers. Execution 
of a program or algorithm can be thought of as a partic- 
ular class of dynamical processes. In this book we have 
repeatedly noted that the information science perspective 
involving algorithmic thinking is in an interesting way com- 
plementary to the more conventional theoretical physics 
approach involving calculus and differential equations, and 
it has led to surprising insights. 


(a) Human evolution as driven by the double helix of science and technology.
(b) The positive feedback loop of science and technology producing
knowledge, technology and the human expertise.
Figure III.5.3: The double helix of science and technology.


We encountered the notion of information towards the end
of Chapter I.1 while introducing the notion of entropy as the
logarithm of the number of micro-states corresponding to
a given macro-state. It is a measure of information capacity
of the system, or stated differently, for the information
loss in going from the micro- to the macro-description of
that system. It involves the aggregation of micro degrees
of freedom into far fewer macro degrees of freedom. In
that sense entropy is a measure for hidden information. In
Chapter I.2 we gave a small introduction to the basics of
information theory as initiated by Turing and Shannon, and in
the section on black holes we discussed the
Bekenstein-Hawking entropy and the famous black hole information
paradox.


In the quantum realm, we introduced in Chapter II.1 the
idea of a ‘bit mechanics’ as the most basic of all dynamical
systems leading to the notion of a qubit, with its
two-dimensional Hilbert space. In the following chapters we
illustrated many fundamental quantum concepts referring
to this basic quantum system. In Chapter II.4 we talked
about teleportation of quantum information, about quantum
gates and circuits, and went into a rather detailed
discussion of Shor’s quantum factorization algorithm.
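

As a reminder of what the factorization algorithm is after, here is a sketch of its purely classical skeleton. It is not the quantum algorithm itself: the period is found by brute force in find_period, which is exactly the step a quantum computer would speed up. The function names and the example numbers (N = 15, a = 7) are chosen for illustration only.

```python
from math import gcd

def find_period(a, n):
    """Smallest r > 0 with a**r = 1 (mod n), found by brute force.

    Assumes gcd(a, n) == 1; this is the step Shor's algorithm replaces
    by a quantum period-finding routine.
    """
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r

def factor_from_period(a, n):
    """Try to split n using the period of a modulo n.

    Returns a pair of factors, or None if this choice of a fails.
    """
    common = gcd(a, n)
    if common != 1:
        return common, n // common      # lucky guess: a shares a factor
    r = find_period(a, n)
    if r % 2 == 1:
        return None                     # odd period: try another a
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None                     # trivial square root: try another a
    p = gcd(half - 1, n)
    return p, n // p

print(find_period(7, 15), factor_from_period(7, 15))
```

For a = 7 the powers modulo 15 run through 7, 4, 13, 1, so the period is 4, and gcd(7² − 1, 15) = 3 reveals the factors 3 and 5.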


So indeed, the notion of information popped up everywhere 
justifying the blue column on the right-hand side of Fig- 
ure III.5.2(a). 
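

The counting definition of entropy recalled in this section, the logarithm of the number of micro-states belonging to a given macro-state, can be made concrete with a toy system. The sketch below assumes N coins as the micro-description and ‘the number of heads’ as the macro-description; the function name boltzmann_entropy is chosen for this illustration, and entropy is expressed in units of Boltzmann's constant.

```python
from math import comb, log

def boltzmann_entropy(n_coins, n_heads):
    """Entropy S = ln(Omega), in units of k_B.

    Omega is the number of micro-states (coin sequences) that share the
    macro-state 'n_heads heads out of n_coins'.
    """
    return log(comb(n_coins, n_heads))

n = 100
entropies = [boltzmann_entropy(n, k) for k in range(n + 1)]
most_likely = max(range(n + 1), key=lambda k: entropies[k])
print(most_likely, entropies[most_likely])
```

The macro-state ‘all heads’ corresponds to a single micro-state and has zero entropy: nothing is hidden. The macro-state ‘half heads’ hides the most information about which coins are which.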


The power of symmetry as guiding principle. We saw
that symmetry is a powerful notion with applications on all
levels of the quantum ladder. This is reflected in the rich
nomenclature involving symmetry concepts, like global
versus local (gauged), space-time versus internal, exact
versus approximate, and broken versus unbroken symmetry.
It is not surprising that the notion of symmetry popped up
in many chapters. We decided to devote Chapter II.6 to the
many ways symmetry concepts have entered physics. In
a sense symmetry, just like information, also deserves a
full column in Figure III.5.2(b).


Symmetries in classical as well as quantum physics are
linked to conserved quantities. Therefore they lead to a
transparent labeling of the physical properties of states. This
allows us to give names to things like ‘energy’, ‘angular
momentum’, ‘charge’, or ‘isospin’.
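

The link between a symmetry and a conserved quantity can be watched at work in a small classical simulation. The sketch below is an illustration under stated assumptions: a single particle in a rotationally symmetric inverse-square force field (units chosen such that GM = 1), integrated with a standard velocity-Verlet scheme. All names and initial values are made up for the example. Because the force always points along the position vector, the angular momentum does not change along the orbit.

```python
def accel(x, y):
    """Inverse-square central force towards the origin (units with GM = 1)."""
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def angular_momentum(x, y, vx, vy):
    """z-component of r x v for unit mass."""
    return x * vy - y * vx

def simulate(steps=10000, dt=0.001):
    """Velocity-Verlet integration of a bound orbit.

    Returns the angular momentum at the start and at the end.
    """
    x, y, vx, vy = 1.0, 0.0, 0.0, 0.8
    l_start = angular_momentum(x, y, vx, vy)
    ax, ay = accel(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # drift
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
    return l_start, angular_momentum(x, y, vx, vy)

l_start, l_end = simulate()
print(l_start, l_end)
```

The position and velocity change continuously, yet the label ‘angular momentum’ stays attached to the orbit, which is what makes it a useful name for a state.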


Symmetry considerations play a crucial role in analysing 
and understanding the solution spaces of the fundamen- 
tal equations of quantum physics, like the spectra of sin- 
gle atoms and molecules as well as the states of many- 
body condensed matter systems. And symmetry served 
as a successful guiding principle in the uncovering of the 
underlying structure of subatomic physics encoded in the 
Standard Model as an expression of the underlying gauge 
symmetry. 


Symmetry-breaking turned out to be a key concept to ex- 
plain the many different guises in which symmetry mani- 
fests itself on all levels in nature. From Zeeman splitting 
on the atomic level to spontaneous magnetization or su- 
perconductivity on the macroscopic level, to the existence 
of the Higgs particle on the subnuclear level. Indeed the 
idea of symmetry-breaking led to a unified understanding 
of the phase structure predicted by a wide variety of theo- 
retical models. 


The power of modelling as a discourse. Most models 
are quantitative in nature and by construction logically con- 
sistent. An ever-expanding body of symbolic relations that 
may be used to represent anything you can imagine. A 
human-made symbolic language ideally suited for a truly 
scientific discourse. Many of the great scientific turning 
points are cast in simple mathematical equations, or math- 
ematically defined rules. 


State of the art modelling. Modelling is not only a way to 
talk about reality; it is also a way to talk with reality. It 
is a productive way of framing the scientific discourse. A 
state of the art model is rarely completely correct. It has its 
strong and illuminating sides but also its weaknesses. So 
especially once the systems become complex with many 


hidden feedback loops and many coupling parameters, one
doesn’t expect perfect predictions, and even less so for the
long-term future. What you gain in adaptability you lose in
predictability. Think of modelling the climate or the spreading
of viruses like Covid-19 or Ebola, or the endless efforts to 
properly model the good old economy. 


The modelling activity furnishes a platform to study the ef- 
fect of possible interventions. This is an interactive plat- 
form that can bring opposing interest groups together in 
a reasonable debate or negotiation, assuming both share 
enough purpose. Playing with the parameters of models 
gives a clear impression of what might go wrong, what 
the vulnerabilities of the system are, and what type of tip- 
ping points can occur. Models thereby can forge the highly 
needed compromises in order to be able to deal with the 
problems one is faced with. 
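

What ‘playing with the parameters’ and finding a tipping point means can be illustrated with one of the simplest epidemic models, the SIR model. This is a textbook toy model, not a model discussed in this book, and the parameter values are arbitrary assumptions. A fraction s of the population is susceptible, i is infected and r has recovered; beta is the infection rate and gamma the recovery rate. The sketch integrates the model with simple Euler steps and reports the peak fraction of infected people.

```python
def sir_peak(beta, gamma, i0=0.01, dt=0.1, steps=3000):
    """Peak infected fraction of a basic SIR model (Euler integration)."""
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak

# Same recovery rate, two different infection rates.
print(sir_peak(beta=0.05, gamma=0.1))   # beta/gamma < 1: outbreak dies out
print(sir_peak(beta=0.30, gamma=0.1))   # beta/gamma > 1: epidemic wave
```

The ratio beta/gamma acts as a tipping point: below one the number of infected only decreases, above one a wave builds up. An intervention that pushes the ratio below one changes the outcome qualitatively, not just quantitatively.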


Analytic versus algorithmic thinking. We have stressed 
that a crucial aspect of scientific progress is the parallel 
development of mathematics as a language for modelling 
nature. Nowadays we should also include the crucial im- 
portance of computation and algorithmic thinking as pow- 
erful means to achieve progress in science. This concerns 
a wide range of methodologies, beginning with simple nu- 
merical methods to solve systems of mathematical equa- 
tions to advanced simulation methods for complex systems 
like agent-based modelling. But also methodologies like 
machine learning to collect and analyse large data sets, al- 
gorithms to detect correlations, that make predictions pos- 
sible without an actual understanding of the causal mech- 
anisms underlying them. 


Rule-based models. In this era of computational
empowerment, we are increasingly driven away from completely
analytical, closed systems of equations like those of
Newton or Maxwell, to more evolutionary approaches like
simple rule-based models: rules that are iterated very, very
many times and may lead to structural entities in which
we may recognize fundamental aspects of reality. This
approach involves a shift from analytic to more algorithmic
thinking.


(a) The structural hierarchy mapped onto a circle.
(b) Three fundamental frontiers.
Figure III.5.4: The structural hierarchy of the material world and the basic frontiers of science. In (a) we mapped the structural
hierarchy onto a circle. Moving clockwise is moving towards larger scales, starting from 1072? and extending all the way to 10+2°
meters. The human scale is kind of in the middle. In (b) we indicated the three fundamental frontiers. On the left the large-scale
frontier of astronomy pursued through space observatories like the Hubble and the James Webb. On the right the small-scale frontier
of high-energy physics pursued at CERN and Fermilab for example. The arrows pointing towards the bottom symbolize the multiple
frontiers of the life sciences including neuroscience. These naturally expand into the vast domain of information and computer science
that are redefining the range and ambitions of the social sciences including economics.


A key feature is that simple algorithms can generate ex- 
tremely complex patterns with all kinds of emergent or- 
der. That emergent order is very hard to predict in ad- 
vance using tools from standard analysis and geometry; 
its complexity can only be understood from actually run- 
ning the algorithm for a sufficiently long time. We speak 
of irreducible complexity inherent to certain simple
rule-based dynamical systems: for example cellular automata
like John Conway’s Game of Life, or evolutionary pattern
growth algorithms on networks. The simplest way to find out
what the structures are that emerge from a certain rule is 
to run the corresponding program long enough. We refer 


to the extensive literature on this subject by its pioneer and
protagonist Stephen Wolfram, who is also the founder and
CEO of Wolfram Research, the company behind the software
environment Mathematica and the Wolfram Language. In his
latest project aimed at ‘finding a new fundamental theory of
physics’ he argues that all of quantum may be the product of
iterating a simple rule-based algorithm! Another great
mission, but for now also incomplete.
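

To make the idea of ‘running the rule to see what emerges’ tangible, here is a minimal sketch of Conway's Game of Life in Python. The representation (a set of live cell coordinates on an unbounded grid) and the names step and glider are choices made for this illustration. The rule itself is the standard one: a dead cell with exactly three live neighbours comes alive, and a live cell survives with two or three live neighbours.

```python
from collections import Counter

def step(live):
    """Apply one generation of the Game of Life to a set of live cells."""
    # Count, for every cell, how many live neighbours it has.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# The 'glider': five cells given as (row, column) pairs.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears, moved one cell diagonally.
shifted = {(r + 1, c + 1) for r, c in glider}
print(state == shifted)
```

Nothing in the rule mentions motion, yet the glider travels across the grid. It is a small example of emergent order that is far easier to discover by running the program than by analysing the rule.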


Scenarios for past and future 


Science at large. In this final section I would like to put
the whole quantum story in the wider context of science
in general, a perspective that derives also from my earlier
book titled In Praise of Science: Curiosity, Understanding
and Progress. In doing so I have adapted some of the
imagery created for that book.


To me one of the most remarkable facts we are aware of is 
that nature evolved from a random and structureless initial 
state with a uniformly distributed low information density, to 
a state of very high information content, very much local- 
ized in the most advanced of biological organisms such as 
human beings. It did so by following a set of strict rules we
call ‘laws of nature’. The most stupefying twist is that these
rules have been hidden until we as human beings became 
aware of them after millennia of carefully researching and 
modelling what we observed. Indeed, nature seems to be 
in search of itself, becoming aware of itself through this 
concerted yet indefinite human effort. 


The double helix of science and technology. 


Let us focus a bit more on the mechanism underlying this
process of progress as depicted in Figure III.5.3. On the
left we see a schematic of what I have called ‘the double
helix of science and technology’. It is like a mutually
inspiring, almost ritual dance, generating knowledge and
technology, but also the expertise of scientists and engineers
who are able to create and apply that knowledge.
Paraphrasing Francis Bacon, it visualizes the idea that ‘wonder
is the seed of knowledge’ and ‘knowledge is the seed of
technology’, which in turn is the seed of new ‘wonder’ and
scientific discovery.


This perpetual machine works because technology also in- 
volves the invention of new instruments that shift the bound- 
aries of what is observable. It pushes the observable in 
an objective sense. The domain of empirical investiga- 
tion keeps expanding, generating an ever-growing body of 
knowledge, from instruments like microscopes and
telescopes, all the way up to MRI machines, accelerators, and
not to forget computers. The power to compute, to
simulate numerically, and to screen immense quantities
of data for all kinds of correlations and patterns which
are hidden from the human eye, is invaluable for human
progress.


This human-made evolutionary process overtakes biolog- 
ical evolution in the sense that it continuously offers new 
options to humanity to move forward. I use the term options
on purpose because it implies the notion of choice.
The term progress suggests that society will always ben- 
efit, but that is not necessarily the case. What is certain, 
however, is that society will keep being bombarded with 
ethical and moral dilemmas, because those are inherent 
to that double helix of innovation. 


History has taught us that technology is a double-edged
sword which may be used in constructive as well as
destructive ways. That means that it requires a society
that has the ability to make the right choices and in
particular manages to avoid a proliferation of the evil aspects of
technological achievement. I think there is ample room for
optimism, but to close one’s eyes to the risks and the dark
sides that are certainly there is dangerously naive.


Looking at the double helix of Figure III.5.3(b) one realizes 
that it is a magical machine that is not easy to stop. It is 
a positive feedback loop. It is hard to forbid curiosity or 
creativity by law but there have been regimes that did ex- 
actly that, a game only with losers. This machine is much 
more autonomous than most people are aware of. It takes 
a great deal of expertise and scientific awareness to nav- 
igate society in a way that the constructive opportunities 
get amplified and the destructive ones are eliminated as 
far as possible. It is quite evident that good science does 
not work by popular vote. The scientific method is open 
to critique and rigorous analysis, but it is not democratic 
in the ‘one man one vote’ sense. That does not preclude
the hope that, by the time new technological options present
themselves to society, well-informed crowds
will demonstrate their wisdom in governing their
implementation.


This observation once more underscores the importance 
of fighting scientific illiteracy through broad educational pro- 
grams introducing science and technology and raising the 
awareness of the social impact they may have. It is our 
duty to educate a critical audience that is conversant with
topics that will shape our common future. In my opinion 
those topics include the possible ways in which we may 
steer and regulate future applications of science and tech- 
nology so that they improve the human condition not for 
the few but for the many. 


What adds to the complexity of this process is the fact that 
the plusses and minuses of novel technologies are in most 
cases not evident at the moment of their inception.
Unfortunately they are often even intertwined. And that is
precisely why the incorporation of up-to-date scientific
expertise in the political arena is necessary in any
well-functioning, future-oriented democracy.


Trees of knowledge 


What we learned in this process of scientific discovery is 
presented schematically in a series of four subsequent im- 
ages. You may call it a display of the harvest of the double 
helix. 


The structural hierarchy. In the first picture III.5.4(a) we 
mapped the structural hierarchy of the material world onto 
a circle, where moving clockwise we go to ever larger dis- 
tances. At the bottom, roughly in the middle, we see our- 
selves, and it is from that position that we started to ex- 
plore the order of things in- and outside of us, diving ever 
deeper in the microcosmos and looking ever further out in 
the macrocosmos. So one way to look at this figure is that 


it depicts the human effort to understand the world we are 
living in, basically following the double helix of science and 
technology. 


Three fundamental frontiers. The arrows we superposed 
on the circle in the second picture III.5.4(b) indicate how 
the basic frontiers of knowledge have moved forward: on
the left, starting with Galileo all the way up to the Hubble
or Webb space telescopes, and on the right from Antonie
van Leeuwenhoek all the way down to the LHC at
CERN. Very large and very small scales meet and merge
in the Big Bang where modern research fields like
astroparticle physics came to flourish. The Big Bang is the
event where today’s largest and smallest scales of the
universe meet and that is why I have put the scales on a circle
and not on a line.


The inside arrows pointing down to us humans clearly rep- 
resent the evolutionary perspective on structural complex- 
ity like the phenomena of life. The arrow on the left rep- 
resents the study of biology from the macroscopic Dar- 
winian perspective on the speciation of plants and animals, 
and on the story told by the fossils they left behind in the 
earth’s crust. The downward arrow on the right represents 
the unstoppable advance of molecular thinking in the life 
sciences, symbolized by the DNA-molecule. And indeed 
the genes on the DNA molecules tell that same Darwinian 
story but then on the molecular level. These two comple- 
mentary views on evolution therefore meet and merge in 
the modern life and the earth sciences. And in a sense 
this ‘closes’ the circle at the bottom in us humans. 


Three domains: Relativity, Quantum and Evolution. As
indicated in the third picture III.5.5(a), the arrows in the
background represent the large domains of fundamental
scientific inquiry which are anchored in the leading
conceptual frameworks like the domain of relativity (concerning
space-time and gravity), the domain of quantum (covering
all forms of constituent matter and the forces between
them), and finally the domain of evolution, the concerted
effort to gain a unified understanding of the tremendous
diversity and complexity that evolved in nature over
time.


(b) Turning points in our understanding.
Figure III.5.5: The structural hierarchy unravelled by the sciences. On the left in the circle it is basically the gravitational force that
causes structure, while on the right it is due to the other forces. Top down we see basically how time evolution both from the left (from
large scales down) and the right (from small scales up) lead to ever more complex structures. The turning points in our understanding
of nature can also be mapped on the circle.


Quantum versus Relativity. Quantum theory is less ac- 
cessible than relativity, because as we saw it is the im- 
pressive legacy of a great number of outstanding scien- 
tists that filled over a century of successful groundbreaking 
research. For that reason quantum has not been person- 
alized to the degree that relativity has been identified with 
the person of Albert Einstein, and maybe that also explains 
why intellectual giants like Bohr, Schrödinger, Heisenberg 
and Dirac never reached the status of a public idol like Ein- 
stein. The painful paradox is that whereas their profound 
work is leaving ever deeper marks in modern life, most 
people bitterly complain that they do not understand a sin- 
gle word of it. And that was one more reason to write these 
books. 


It is interesting to note that a Nobel prize for the theory 
of relativity as such has never been awarded, while there 
have been more than fifty linked to quantum theory as
witnessed by the tables in appendix B on ‘Chronologies,
ideas and people’. Indeed, the prize awarded to Einstein
was in recognition of his explanation of the photo-electric 
effect, which is a fundamental contribution to quantum the- 
ory and has nothing to do with relativity. So the irony is that 
he received the Nobel prize for his contribution to a theory 
he basically didn’t believe in! 


With so many Nobel prizes awarded, it is no surprise that 
a book that aims slightly higher than just summing up the 
basic results is bound to be voluminous indeed. Be my 
guest! 


Turning points. In the fourth picture, Figure III.5.5(b), we show
how this endeavor to advance knowledge gave rise to a
rather limited number of truly fundamental turning points
that stand for the great leaps forward in our scientific
understanding of the natural world, the world we ourselves
are part of. It is striking to see that there are only so few.
It is also striking that so much novel science and
technology derives from such a small number of truly fundamental
insights.


(a) The Big Bang and the subsequent cosmic evolution.
(b) Ultimate questions that concern our deep origins and our long-term
future (if we have one).
Figure III.5.6: Science appears caught between two singularities. The cosmic evolution at large according to the Big Bang scenario
is depicted in Figure (a). The ultimate questions in Figure (b) concern on the one hand the origin somehow hidden in the Big Bang,
and on the other hand where this evolution will bring us and to what extent we can shape that future ourselves. So it concerns
nothing less than the quest for the interpretation or meaning of our universe as a whole, and its present and possible future contents.


Cosmic evolution. Let us continue with the two pictures 
of Figure III.5.6. In the first one we depict the actual
process of cosmic evolution according to the hot Big Bang
scenario, in which the increasing complexity in dead matter
smoothly turns into the Darwinian story of life. This took
altogether almost 14 billion years, where the Darwinian 
episode ‘only’ covers the last 4.5 billion years. Clearly the 
full story is by no means complete. The figure nicely shows 
how material complexity sequentially evolved as a neces- 


sary consequence of an expanding universe slowly cooling 
down. It is this story of cosmic evolution that brought most 
of the empirical natural sciences together so harmoniously, 
that makes the narrative or perspective of science on the 
whole of nature so clarifying and illuminating. It is in that 
story that reductionism meets holism. A beautiful product 
of brainpower, enlightenment and perseverance. 


Ultimate questions: from origin to fate. Science is a 
systematic process of advancing understanding by creat- 
ing ever better observational abilities, which in turn allow 
for ever better modelling of reality. The circle that appears 
in all the figures by no means tries to convey the idea that 
science is a closed body of knowledge, a narrative
completed. Science is always ‘work in progress’, and may on
the one hand be characterized by the questions it did
answer, but on the other hand by the questions it raised but
did not answer. This is indicated by the question marks at
the top and bottom of the would-be circle. They represent 
ultimate questions that in fact rip open the circle allowing 
for additional realities we have not yet any idea about. It 
illustrates how the whole of science is basically caught in 
between two essential but enigmatic singularities. 


On top we have what I called the ‘cosmic short’ between
the physics of the smallest and largest conceivable scales
which somehow meet in the Big Bang. We like to think
of the Big Bang as an event, but maybe it is better to
think of it as a gate to an unknown territory where relativity
and quantum presumably govern in a truly unified fashion.
At that point there is room for fundamentally new insights.
That gate would give access to the physical origins of the
Big Bang itself. Our lack of understanding is probably best
characterized by the term ‘Big Bang singularity’, which of
course refers to the extrapolation of the early
universe to a quite unphysical initial state with an infinite
temperature and energy density.


The arrows of time move downward towards the domain 
of human evolution, of the human brain, and of human so- 
ciety. Clearly also at that point our understanding is very 
much incomplete. The present state of science poses hard 
questions, like asking how the process of evolution will fur- 
ther unfold. It is a fact that the theory of evolution, in spite 
of having an incredible explanatory power with respect to 
our past, is surprisingly weak as a predictive model. It 
predicts a process of the increasing complexity of organ- 
isms but is not specific about where the breakthroughs 
of — let us call it — biological self-transcendence will take 
place. And this question of predictability has not become 
easier as we humans have become the dominant species 
on Earth. As indicated in the figure, we have moved from
an initial state characterized by extremely high
energy, chaos and a uniform distribution with low information
content or capacity, towards the present state, which has
the signature of very low temperature and energy, allowing
for highly localized forms of complex order and high
information capacity, like the brains of human beings for
example.


Evolution at large. In Figure III.5.7 I have presented an
alternative visualization of the cosmic evolution at large, and
marked the most consequential branchings of the evolutionary
tree. I like to think of these branchings as moments
of radical innovation, as irreversible transitions or tipping
points. Indeed we went through the evolution of dead matter
all the way up to the production of the chemical elements,
which were a necessary prerequisite for the creation
of sustainable life on Earth and maybe elsewhere on
what are called exoplanets. In a universe with some $10^{21}$
stars the probability of extraterrestrial life can’t be negligible,
I would think.


In order to cope with the unknowns of the future a solid 
knowledge of our past appears to be a crucial prerequisite. 
So we should celebrate collaboration in scientific research 
efforts addressing such questions, like the launch and operation
of the James Webb Space Telescope, which allows us
to look deeper into the universe than we ever did before, exactly
to better understand its remote past. It is a splendid
international collaboration of NASA and the European and
Canadian Space Agencies. Its mission is to collect hard
data concerning the beginning of structure formation and
the births of stars, as well as the possibility of extraterrestrial
life (see Figures III.5.8 and III.5.9).


Once life began we had another 4.5 billion years of biological
evolution culminating in such attributes as consciousness
and intelligence, which allowed humanity to basically
take over its planet. Human evolution transformed us
from mere inhabitants into the custodians of planet Earth. It
appears that we have taken our fate into our own hands. We
have become responsible for our own future. At present
that means that we have to face such inconvenient truths
as the climate crisis, and we urgently need to act in order
to keep the planet habitable. Al Gore, the former vice
president of the US and a powerful voice in favor of direct
action to avoid climate catastrophes, once noted that by
broadcasting an inconvenient truth one is bound to wake
up the most powerful enemies, which makes taking proper
action even harder.


Figure III.5.7: Cosmic evolution at large. Does human evolution,
driven by the double helix of science and technology, allow
for a post-biological branch dominated by artificial intelligence,
machine learning and quantum computing?


We also have to seriously analyse the consequences of
the great information revolutions that obey Moore’s law,
and the introduction of the internet with its radically novel way
of ‘connecting people’. The introduction of these new technologies,
which allow for instantaneous and global human interaction,
clearly implies a fundamental change in the human
condition, which has caused a tipping point in social
awareness and coherence. It started a process of global
restratification of society, and the unfolding of unheard-of
concentrations of power and wealth. This process is full
of social risks and has to be critically monitored and controlled
by governments and international institutions that
should be endowed with both sufficient funding and executive
power. This is a far cry from today’s reality.


To cope with the many negative aspects of these developments
requires the development of the notion of global citizenship.
People should be educated to be aware of what is
happening, and institutions should insist on openness,
accessibility and transparency. This may necessitate adding
new chapters to the declarations of fundamental human
rights, which define these rights and extend them to people’s
existence on the World Wide Web and other cyberworlds. It
teaches us, as the dominant inhabitants of planet Earth,
that the tremendous amount of freedom we have achieved
implies a huge, undeniable responsibility.


A post-biological branch? Information philosophers and 
futurists like Max Tegmark, Nick Bostrom and Yuval Harari 
warn us that with the rapid advances in artificial intelli- 
gence, like machine learning, and quantum computing, 
machines may well take over completely as we become 
more and more dependent on them. Not just for gathering 
relevant information, but also for making rational, optimal 
decisions. There are major obstacles to be overcome, namely
to extend the abilities of artificially intelligent algorithms to
have ‘general intelligence.’ This is a much harder problem
than acquiring expertise in a limited context and domain 
in which algorithms already outperform humans. General 
intelligence is the outcome of our biological evolution and 
unsurprisingly, that is what humans excel in. 


Anyway, the question posed by the orange branch in the
figure is whether we are on the verge of a transition towards
a radically different post-human, post-biological evolutionary
phase. This does not mean that we would no
longer exist; bacteria, after all, have managed to survive, in many
ways all too well, for billions of years after more complex
organisms took over. What the post-human branch presumably
implies is that we are no longer the glamour boys of
creation, but rather that we may turn into somewhat outdated
pieces of biological apparatus of reduced relevance,
compared to our superintelligent silicon or quantum brothers
and sisters to be. Maybe the optimal way forward is to
engage in further exploring symbiotic options.


The intrinsic value of science. We should be aware that
politicizing science is a threat to its primary objective: the
search for objective truths. The risk of trying to politicize
science is not just that it leads to crimes against logic,
but also that it corrupts scientific integrity. It often involves
a form of ‘passive lying’, which refers not to directly telling
plain lies (active lying) but rather to not telling the truth, that
is, the whole truth. It is like leaving important terms out of
the equation and thus propagating models that fail reality.
It is like the often applied strategy of spreading misinformation
to gain political or commercial support and influence.
‘The end justifies the means’ is the slogan that easily
comes along and allows the most well-funded lobbyists
to dominate the political landscape. Indeed, advertising
often succeeds in its goal of better sales by
not telling the whole truth.


But doesn’t science do the same, you might object? Yes
and no! It is certainly true, as I have noted repeatedly in
the book, that science is ‘work in progress’, and therefore
scientific ‘truths’ are also relative and should be subject to
refutation if decisive arguments or data are brought
forward at some point. Indeed the notion of an absolute
truth is basically incompatible with the notion of science as
an incomplete body of knowledge. And it is this aspect
that makes the scientific infrastructure, its institutions and
funding strategies vulnerable to abuse. This is a paradoxical
aspect of the role that science plays in society: in spite
of the fact that there is no such thing as an absolute truth,
we do not hesitate to board planes, go to hospitals, and
get addicted to our cell phones. It appears that scientific
truths, if not absolute, are at least extremely robust!


The symbiotic relationship between science and technology
is harder to disentangle. As stated before they need
each other in essential ways, and yet technology is by
definition a double-edged sword. The best we can do is
to insist that the discourse on science and technology at
all stages be a hundred percent transparent and respect
the principles of a solid democracy. This refers to a higher
vocation, and adds elements of ideology and wishful thinking
to the notions of science, technology and innovation,
which in turn make them more vulnerable!


Figure III.5.8: An artist’s impression of the James Webb Space
Telescope (JWST) unfolding in space at a distance of 600,000
km from the Earth. Its mission is to look at the very early stages
of the universe as a whole and the very early stages of structure
formation. It is furthermore the first space telescope to study the
possibility of extraterrestrial life by analysing the chemical
composition of the atmosphere of exoplanets. The slogan would be:
Are there somewhere in the universe alternative humankinds?
(Source: Adriana Manrique Gutierrez / NASA)


In my opinion what we need is quite the opposite of what is
trending: we need to bring more science, scientific literacy
and expertise into the political arena, to bring the necessary
amount of integrity into the political discourse. Unfortunately
science, as the evidence-based cornerstone of human
culture, remains a vulnerable institution that should be
protected and defended against the arrogance of power,
media popularity, the spreading of misinformation, and lobbying
practices that turn into corruption. In the words of the
well-known spy novel author John le Carré:


One day somebody will explain to me why it is that, 
at a time when science has never been wiser, or
the truth more stark, or human knowledge more 
available, populists and liars are in such pressing 
demand. 

John le Carré 


Indeed, as soon as we allow the politicization of the fund- 
ing structure, and make it a prey to lobbyists and com- 
mercial interests, or force it to serve the vested interests 
and privileges of some ruling class, irrespective of the po- 
litical system we adhere to, we are sure to lose science. 
It will decay from a devoted search for truth — also if that 
truth turns out to be inconvenient — to some kind of hidden 
or even blatant form of lopsided advocacy. Legal experts
or lawyers, in contrast to scientists, are allowed to limit
their sources for research, and part of their skill lies in
artfully selecting the evidence that supports their client’s
case.


Here is a large-scale perspective offered by the eminent 
quantum scientist Charles Bennett: 


The Enlightenment inspired Universal Declaration 
of Human Rights promulgated in 1948 after a decade 
of technical sophistication accompanied by inequity 
and cruelty on an unprecedented scale, exempli- 
fies the seemingly still attainable goal of an equi- 
table, peaceful society that manages its environ- 
ment and itself well enough to last millions of years. 
Charles H. Bennett 


Human history looks like a perpetual battle between power
and knowledge, with power always claiming victory in the
short term (under the argument of improving efficiency and
‘the’ economy) and knowledge always being the winner in
the long term, even though the price society pays for finding
out can be disproportionately high. We have created dangerously
pervasive constructs like the military-industrial complex,
or the medical-industrial complex and now also the
information-industrial complex, which have turned into autonomous
self-inflating entities thoroughly intertwined with
human society. These thrive on a delicate interplay between
innovation and commercialism using the creation of
fake needs and fake fear. They embody an abuse of power
that is derived from knowledge. The sobering fact is that
lies and misleading accounts spread fast and one can only
hope that truth will ultimately prevail. I myself firmly believe
that to be the case, but overall it remains an open question.
Too much science- and technology-based power in too few
hands is a recipe for societal disasters. Let me close by
quoting Bennett once more:


Unfortunately due largely to the increased range 
and speed of communication, misinformation has 
emerged as a meta-threat to equity and civilisation. 
By luring people into self-isolating bubbles, to be 
soothed, entertained and incited by incompatible 
versions of reality, it empowers autocrats and dem- 
agogues, it hobbles democracies and makes co- 
operation on globally urgent problems like climate 
change almost impossible. 

Charles H. Bennett 


Addressing scientific illiteracy. 


Heisenberg? Huh, isn’t that the guy from Breaking 
Bad? 


After the red light started flashing, the radio host nodded
to me and asked: ‘Well, professor, can you tell us in a few
lines what quantum physics is?’ And I said: ‘Hm, yes of
course, hmm, I mean No! Hmmm, I mean yes, but ...’ Talking
quantum to family and friends at a birthday party often
feels like being a tour guide in London for extraterrestrials
who don’t happen to know what a bridge, a museum or a
traffic light is. As I mentioned before, the fact that quantum
things are to a large extent invisible does not mean
that they are not there. They certainly are. And as we
have learned, the fact that most quantum things are not
discernible by the naked eye doesn’t mean that they are
not relevant or important. In spite of being unknown and
widely ignored, the quantessentials are here to stay. This
leaves us with the sobering fact that they are still surprisingly
unfamiliar. This in my opinion is a strong call for
worldwide efforts to educate, to fully develop the tremendous
intellectual potential that is present everywhere at
any instant.


Figure III.5.9: Starbirths in the Carina Nebula as seen by the James Webb Space Telescope. This image made in July 2022 is divided
horizontally by an undulating line between a cloudscape forming a nebula along the bottom portion and a comparatively clear upper
portion. Speckled across both portions is a starfield, showing innumerable stars of many sizes. The smallest of these are small,
distant, and faint points of light. The largest of these appear larger, closer, brighter, and more fully resolved. The upper portion of
the image is bluish, and has wispy translucent cloud-like streaks rising from the nebula below. The cloud-like structure of the nebula
contains ridges, peaks, and valleys, an appearance very similar to a mountain range. (Source: NASA, ESA, CSA, and STScI.)


I have spent about half a century in that invisible quantum
world, doing a lot of active research, but also getting
slightly frustrated at not being able to share much of it at
everyday occasions like birthday parties. That made me sad
but also aware that I should stop whining and just sit down
and write a book about what I learned on my journeys
through that amazing quantum world. A modest attempt to
help alleviate the burden of scientific illiteracy. And that is
how the three lines allowed to me by that sympathetic interviewer
gave rise to these three volumes about the Power
of the Invisible: The Quantessence of Reality.


Further reading.

Classics of popular physics:

— Cosmos
Carl Sagan
Random House (1980)

— A Brief History of Time
Stephen Hawking
Bantam Dell Publishing Group (1988)

— The Cosmic Code
Heinz Pagels
Dover Publications (2012)

On Science and the Future of Human Culture:

— Superintelligence: Paths, Dangers, Strategies
Nick Bostrom
Oxford University Press (2016)

— Sapiens
Yuval Harari
Penguin Books (2015)

Complementary reading:

— A Project to Find the Fundamental Theory of Physics
Stephen Wolfram
Wolfram Media (2020)

— In Praise of Science: Curiosity, Understanding, and Progress
Sander Bais
MIT Press (2010)

— Mysteries of the Quantum Universe
Thibault Damour and Mathieu Burniat
Penguin (2020)


Appendix A 


Math Excursions 


On functions, derivatives and integrals


Do not worry about your difficulties in mathematics. 
I can assure you that mine are greater still.
Albert Einstein 


Functions. Functions are a general class of objects in
mathematics that have endless applications in all fields of
science. A function is an object (let us denote it by the
symbol f) that may depend on a set of variables (arguments),
say $\{x_\alpha\}$. As such it assigns a value to f for any
allowed point in the space of variables $\mathcal{X} \sim \{x_\alpha\}$: in other
words it provides us with a map $f : \mathcal{X} \to F$. The target space F
of the function denotes the space where f itself lives, and
can be many things; we think in particular of the real numbers
$\mathbb{R}$, the complex numbers $\mathbb{C}$, or some (other) vector
space V.¹


Think of the temperature T in the room you are in. It is
a function that depends on where and when, i.e. on the
set of variables $\mathcal{X} \sim \{x, t\}$; you could say $T : \{x, t\} \to \mathbb{R}$,
and we indicate this dependence by writing $T = T(x, t)$.
The potential energy V(x) of a particle is a real function
defined over the real position space, and like the temperature,
V may differ from place to place. If we plot the value
of a real function f as the ‘height’ above the point x then
f(x) defines a kind of landscape over $\mathcal{X}$. Very basic features
of functions are given in Figure A.1, which refer to
whether they are continuous and/or differentiable. We will
mostly assume that we are dealing with smooth functions:
those are functions for which all derivatives exist and are
continuous.


¹ We mention the words ‘complex numbers’ and ‘vectors’ here just in
passing; these notions are discussed in later Math Excursions.


Figure A.1: Function classes. We have plotted three functions
which belong to different classes. A discontinuous function on
top (the function value jumps at $x = x_0$). In the middle a continuous
function that is not differentiable at $x = x_1$, $x_2$ and $x_3$,
where the slope is discontinuous when approaching the points
from the left and the right. At the bottom a smooth ($C^\infty$) function,
which is by definition infinitely differentiable, meaning that all
higher derivatives exist and are continuous.


We have mentioned other quantities which are basically 
functions: the position and velocity are functions of the 
time variable. In d dimensions these are vectors (‘vector
functions’) with d components. Vectors have not only a
magnitude but also a direction which makes them different 
from being just a number. A number can be written down 
and be communicated by mail; this is not true for a vector 
because the direction can get messed up. The electric and 
magnetic fields are both vector-valued functions or vector 
fields in short. The same is true for the velocity field of a 
river, it encodes the direction in which the fluid flows at any 
given point in the fluid. So even if you were not aware of 
the notion of (vector) functions, you presumably now re- 
alize that you are quite familiar with them. To give you 
an impression, we have plotted some typical elementary 
(real) functions of a single variable in Figure A.2. 


With real functions you can do what you can do with numbers
if you do it pointwise, i.e. in every point of $\mathcal{X}$. For
example, we define the product h of functions f and g by
the function h(x) = f(x)g(x). The limitations on what
you can do with functions are of course determined by which
operations are defined in F.


Of interest are two natural operations one may define on 
smooth functions that play a fundamental role in many ap- 
plications. These operations are basically each other’s ‘in- 
verse’; one is called differentiation or taking a derivative, 
the other is integration, or taking the integral. We discuss 
them for the case of real functions. 


Differentiation. Think of a real function f(x) of one real
variable; then we may draw it as a curve on graph paper,
putting x along the x-axis and f(x) along the y-axis, as
we did in Figure A.3(a). The derivative of the function with
respect to the variable x at a point $x_0$, denoted as $\frac{df}{dx}$ or
simply with a prime, i.e. $f'(x_0)$, is just the slope of that curve
above the point $x_0$.


For example, if the function is linear in x, f(x) = 3x, then
that function has a constant slope equal to 3 and thus the
derivative is a constant, f’(x) = 3. Having given this heuristic
definition of the derivative, I should hasten to say that
this is a phenomenally important concept in science, as it
embodies the mathematical statement that exactly quantifies
the otherwise rather vague notion of ‘change’.


Looking at the derivative operator more abstractly, it can
be considered as a map $\frac{d}{dx} : F \xrightarrow{\text{slope}} F$. Points where
the derivative of a function vanishes correspond to points
where the slope is zero and the function typically has a maximum
or a minimum, as we have indicated in Figure A.3(a). Note
that if one knows a function in the neighborhood of a point
$x_0$ one may calculate its derivative in that point. This is
clear from the formal definition of the derivative:
$f'(x) = (f(x + \Delta x) - f(x))/\Delta x$, taken in the limit of ever smaller
$\Delta x$. This definition implies another useful relation (also
in the small $\Delta x$ limit), namely that we may write:
$f(x + \Delta x) \simeq f(x) + f'(x)\Delta x$. This provides a clear statement
of the use and meaning of a derivative: if we make a tiny
move from x to $x + \Delta x$ in space, then the corresponding
change in any function f(x) is from f(x) to $f(x) + f'(x)\Delta x$,
to lowest order in $\Delta x$.
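
The difference quotient in the definition above can be tried out numerically. The following is a minimal sketch, not taken from the book: the helper name `forward_difference` and the step sizes are illustrative assumptions, and a finite step only approximates the true limit.

```python
import math

def forward_difference(f, x, dx=1e-6):
    """Approximate f'(x) by the difference quotient (f(x + dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx

def linear(x):
    """The example from the text: f(x) = 3x, which has constant slope 3."""
    return 3.0 * x

# Slope of the linear function at an arbitrary point.
slope_linear = forward_difference(linear, 2.0)

# Slope of sin(x) at x = 0.5; Table A.1 says this should be close to cos(0.5).
slope_sin = forward_difference(math.sin, 0.5)

# First-order prediction f(x + dx) ~ f(x) + f'(x) dx for a small step dx.
step = 1e-3
predicted = math.sin(0.5) + math.cos(0.5) * step
actual = math.sin(0.5 + step)

print(slope_linear, slope_sin, math.cos(0.5), predicted - actual)
```

Making `dx` smaller improves the approximation only up to a point, because the subtraction of two nearly equal floating-point numbers eventually dominates the error.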


Let us finally mention that calculating the derivatives of
many standard functions and expressions containing them
is not so hard and usually part of a high school science
math curriculum. We have listed a few derivatives of standard
functions in Table A.1 below. Another way to think
about differentiation, suggested by the small-step relation
above, is to say that it is an operator
d/dx which, applied to a function f(x), generates a translation
(or change) in function space F induced by a small
translation in the underlying configuration space $\mathcal{X}$. We
will make use of this interpretation later on.


Figure A.2: The graphs for some typical elementary real functions f(x), showing their salient features.
(a) The linear function $f(x) = x$. It has a constant slope. It is the simplest odd function as it satisfies $f(-x) = -f(x)$.
(b) The quadratic function $f(x) = x^2$. It has a constant curvature or second derivative. It is the simplest even function (except the constant function) satisfying $f(-x) = f(x)$.
(c) The inverse function $f(x) = 1/x$. The function slowly tends to zero for $x \to \pm\infty$, while it becomes infinite (or singular) for $x \to 0$. It is only defined for $x \neq 0$.
(d) The periodic function $f(x) = \cos(x)$. It satisfies the property $f(x) = f(x + 2\pi)$. Shifting the cosine by 1/4 period to the right one obtains the sine function.
(e) The exponential functions $f(x) = e^{\pm cx}$; these grow rapidly to $\infty$ for $x \to \pm\infty$ and decay rapidly to zero for $x \to \mp\infty$.
(f) The logarithmic function $f(x) = \ln(x)$ is a slowly but ever-growing function. It has a singularity for $x \to +0$.


An example: dispersion. We have been discussing the
energy E of a particle as a function of the momentum p for
the non-relativistic and relativistic cases, with a parametric
dependence on the mass $m_0$. There is another quantity
of importance and that is the dispersion, defined as the
derivative of E with respect to p. The term dispersion
originates in optics, where in a given medium the
frequency depends on the wavelength, which manifests
itself for example in the fact that the angle of refraction
of light depends on the colour (wavelength) of the incident beam.


For matter waves we have that $E = \hbar\omega$ and $p = \hbar k$,
so we can express the dispersion also in terms of E and
p. In Figure A.4, I have plotted the relativistic expression
for the particle energy, $E = \sqrt{m_0^2c^4 + p^2c^2}$, and below
it the dispersion $dE/dp = pc/\sqrt{m_0^2c^2 + p^2}$. There are
roughly three regimes: (i) on the left we have the non-relativistic
regime where $p \ll m_0c$, where the energy is approximately
$E \simeq m_0c^2 + p^2/2m_0$ with linear dispersion
$dE/dp \simeq p/m_0$, and the expression reduces, up to the
mass-energy, to the familiar Newtonian form, (ii) in the
middle we need the fully relativistic expression, and (iii) on
the right we have the ultra-relativistic regime where
$p \gg m_0c$, and we have the approximation $E \simeq pc$ with
dispersion $dE/dp \simeq c = \text{constant}$, which effectively
corresponds to the expression for a massless particle.


Figure A.3: A function, its derivative, and its integral.
(a) The derivative df/dx (purple) of a function f(x) (red). At the extrema
of y(x) the derivative (= slope) is zero.
(b) The integral $\int_a^b f(x)\,dx$ of f(x) is the area below y(x) above the
x-axis minus the area below the x-axis, between the points x = a and
x = b.


Figure A.4: Relativistic energy. The particle energy E as a
function of p in red, and the dispersion defined as the derivative
dE/dp in purple. We have chosen $m_0$ and c equal to one.
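
The three regimes can be checked numerically. This is a small sketch under the same choice $m_0 = c = 1$ as in Figure A.4; the function names `energy` and `dispersion` and the sample momenta are illustrative assumptions, not part of the original text.

```python
import math

def energy(p, m0=1.0, c=1.0):
    """Relativistic energy E = sqrt(m0^2 c^4 + p^2 c^2)."""
    return math.sqrt(m0**2 * c**4 + p**2 * c**2)

def dispersion(p, m0=1.0, c=1.0):
    """Analytic derivative dE/dp = p c / sqrt(m0^2 c^2 + p^2)."""
    return p * c / math.sqrt(m0**2 * c**2 + p**2)

def numerical_dispersion(p, h=1e-5):
    """Central-difference estimate of dE/dp, used as a cross-check."""
    return (energy(p + h) - energy(p - h)) / (2.0 * h)

p_small, p_mid, p_large = 1e-3, 1.0, 1e3

# (i) non-relativistic regime: E ~ m0 c^2 + p^2 / (2 m0), dE/dp ~ p / m0
nonrel_energy_error = energy(p_small) - (1.0 + p_small**2 / 2.0)
nonrel_slope_error = dispersion(p_small) - p_small

# (ii) intermediate regime: compare the formula with a numerical derivative
mid_slope_error = dispersion(p_mid) - numerical_dispersion(p_mid)

# (iii) ultra-relativistic regime: E ~ p c, dE/dp ~ c
ultra_energy_ratio = energy(p_large) / p_large
ultra_slope = dispersion(p_large)

print(nonrel_energy_error, nonrel_slope_error, mid_slope_error,
      ultra_energy_ratio, ultra_slope)
```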


Integration. Given the red curve in the example of Figure
A.3(b), the (definite) integral $F_{ab}$ of a function f(x) between
two points x = a and x = b is just the area under the curve
between the two points. One may also define an ‘indefinite’
integral F(x) or primitive of f(x), which is mathematically
represented by the integral symbol:

$$F(x) = \int f(x)\,dx . \tag{A.1}$$

F(x) has the property that $F_{ab} = F(b) - F(a)$. If a function
is constant, f(x) = c, then the integral is thus simply
$F_{ab} = c(b - a)$ and F(x) would be F(x) = cx + d, where
there is an arbitrary constant d that one can add. Now we
are also in a position to appreciate the remark that these
operations are in a sense each other's inverse: if we differentiate
F(x) we get the original function f(x) back.


The definition of the integral involves a limiting procedure
of an approximation that is not so hard to imagine. To
calculate the definite integral $F_{ab}$, we divide the interval
b − a on the x-axis into a large number N of equal little
segments $\Delta x$; then we denote the centre of each segment
by its coordinate $x_i$, i = 1, ..., N. The integral is then
defined by:

$$F_{ab} = \int_a^b f(x)\,dx = \lim_{\Delta x \to 0} \sum_{i=1}^{N} f(x_i)\,\Delta x , \tag{A.2}$$

as is illustrated in Figure A.5.


Figure A.5: The integral as area. The definition of the definite
integral of f(x) between points x = a and b is the sum of the
positive and negative contributions from the areas of the small
rectangles, in the limit that $\Delta x \to 0$.
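
The sum over small rectangles is easy to carry out for a finite number of segments. The sketch below is one possible implementation, assuming the rectangle heights are taken at the segment centres as described above; the name `midpoint_integral` and the choice N = 1000 are illustrative, and a finite N gives only an approximation to the limit.

```python
import math

def midpoint_integral(f, a, b, n=1000):
    """Sum f(x_i) * dx over n equal segments, with x_i the centre of segment i."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_i = a + (i + 0.5) * dx  # centre of the i-th segment
        total += f(x_i) * dx
    return total

# Constant function f(x) = c: the area should be c (b - a) = 2 * 3 = 6.
constant_area = midpoint_integral(lambda x: 2.0, 1.0, 4.0)

# cos(x) from 0 to pi/2: the primitive is sin(x), so F(b) - F(a) = 1.
cosine_area = midpoint_integral(math.cos, 0.0, math.pi / 2)

# sin(x) over a full period: positive and negative areas cancel.
signed_area = midpoint_integral(math.sin, 0.0, 2 * math.pi)

print(constant_area, cosine_area, signed_area)
```

The last case illustrates the remark in the caption of Figure A.5 that areas below the x-axis count as negative contributions.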


Calculating the integrals of elementary functions is not too
hard, but often integrating is hard and not possible in ‘closed
form’. Therefore numerical approximations are of crucial
importance in most applications, and those are usually
based on approximations in the spirit of equation A.2. The
problem of integration is at the heart of physics and engineering,
exactly because in most cases the laws that govern
nature are formulated as so-called differential equations,
which means that the equations contain derivatives of
the quantities one would like to solve for. Many equations are
‘equations of motion’. The equations of Newton determine
the time evolution of a particle’s position and momentum.
The Maxwell equations do that for the electromagnetic fields, and
the Schrödinger equation for the wavefunction of a quantum system,
while Einstein’s equations describe the time evolution
of the universe. Solving those equations corresponds in
some sense to finding ways to ‘integrate’ the equations for
specific boundary or initial conditions.


In Table A.1 we have listed some well-known functions,
their derivatives, and their primitives (i.e. integrals).


derivative df/dx       function f(x)         integral F(x) = ∫ f(x) dx

a                      a x                   a x^2 / 2
n x^(n-1)              x^n  (n ≠ -1)         x^(n+1) / (n+1)
-1/x^2                 1/x                   ln|x|
cos(x)                 sin(x)                -cos(x)
-sin(x)                cos(x)                sin(x)
k e^(kx)               e^(kx)                e^(kx) / k
1/x                    ln x                  x ln x - x

<-- differentiation                          integration -->


Table A.1: A list of some elementary functions (see also Figure
A.2) in the middle column, with their derivatives on the left
and their integrals or primitives on the right. Taking a derivative
moves you to the left, integrating moves you to the right.
Integration means that one always can add an arbitrary constant to
the integral; this constant is not included in the table.


Example: the harmonic oscillator. The force on a particle
is defined as minus the derivative of the potential energy:
F = −dV/dx; indeed with $V = ax^2/2$, this yields the
harmonic force F = −ax. But given the force we can also
calculate the potential energy by integrating it. We have to
move the particle up the hill from x = 0 to, say, $x = x_0$.
To do so we have to do an amount of work on the particle
which equals the (opposite) force times the distance,
integrated from zero to $x_0$:

$$V = -\int_0^{x_0} F(x)\,dx = \int_0^{x_0} ax\,dx = \left[\tfrac{1}{2}ax^2\right]_0^{x_0} = \tfrac{1}{2}ax_0^2 .$$
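
Both directions of this example, from potential to force by differentiating and from force to potential by integrating, can be verified numerically. The sketch below is only an illustration: the spring constant a = 2, the endpoint $x_0 = 1.5$ and the helper names are assumptions chosen for the check.

```python
A_SPRING = 2.0  # illustrative value for the constant a in V = a x^2 / 2

def potential(x):
    """Harmonic potential V(x) = a x^2 / 2."""
    return 0.5 * A_SPRING * x * x

def force(x):
    """Harmonic force F(x) = -a x."""
    return -A_SPRING * x

def force_from_potential(x, dx=1e-5):
    """F = -dV/dx, estimated with a symmetric difference quotient."""
    return -(potential(x + dx) - potential(x - dx)) / (2.0 * dx)

def potential_from_force(x0, n=10000):
    """V(x0) = -integral of F from 0 to x0, as a sum over n small segments."""
    dx = x0 / n
    work = 0.0
    for i in range(n):
        x_mid = (i + 0.5) * dx
        work += -force(x_mid) * dx  # opposite force times small distance
    return work

x0 = 1.5
force_error = force_from_potential(x0) - force(x0)
potential_error = potential_from_force(x0) - potential(x0)
print(force_error, potential_error)
```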


Differential equations. Differential calculus is basically
the calculus of changes, and differential equations are typically
the equations that govern the change in time or space
of any dynamical system one might think of. They are as
applicable to modelling in classical physics as in quantum
theory, and they are equally well employed in modelling
economics, ecological systems or the climate. As we have
seen, many ‘laws of nature’ take the form of a system of
differential equations. This means that on the left-hand side
of the equation we have the changes of the system’s variables
in time and space, while on the right-hand side they
are expressed as functions of the variables themselves,
i.e. the point in the space of states the system could be
in. Examples were already provided by Newton’s equations
(I.1.3) and the Maxwell equations (I.1.28). The solutions
of these equations therefore describe the dynamical
trajectories in the configuration space that the system
traverses in time. The trajectory depends of course on
the starting point or initial condition. Obtaining solutions
to differential equations has to involve some kind of integration
because we want to get rid of the derivatives, and
that is exactly what makes solving differential equations
so hard. If the equations are linear, meaning that the unknowns
one wants to solve for only appear linearly in the
equation, solutions can often be obtained in closed analytic
form, but if the equations are nonlinear that is only
rarely the case.


Let us conclude this excursion by looking at two differential 
equations of particular interest, a growth/decay equation 
and a wave equation. 


Example: the equation for exponential growth or de- 
cay. We have a container with No radioactive nuclei. Then 
the remaining number N(t) at time t will decrease in time 
at a rate dN/dt. This rate will be proportional the number 
N(t), which is just saying something like, ‘if the population 
is twice as big, twice as many people will die’ So the equa- 
tion we like to solve reads: 


dN _ 


— = —\N. A. 

dt Os) 
this can be cast in the form: 

daN 

— = —\dt. A.4 

N= TA (A.4) 


Now the left-hand side and the right-hand side can be ‘inte- 
grated’, which by using Table A.1 yields the solution: 


In|N| +d =—At > N(t) = Noe ™, (A.5) 


where the constant $e^{-d}$ has to equal $N_0$, the number of
nuclei at time $t = 0$. Note that the solution corresponds
to the green curve depicted in Figure A.2(e). The solution
tells us that the decay is exponential, and we will refer to
this result when we talk about radioactive decay in chapter 1.4.
And if we change the sign in front of $\lambda$ in the equation, we
of course get the red curve in the figure corresponding to
exponential growth, describing some stages of epidemics
or a post on Facebook ‘going viral’.
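The decay law can also be checked numerically. The following is a minimal sketch, not part of the original text: it steps the equation dN/dt = -λN forward in small time steps and compares the result with the closed-form solution N(t) = N_0 e^{-λt}. The function names and the numerical values of N_0, λ and t are illustrative assumptions.

```python
import math


def decay_analytic(n0, lam, t):
    """Closed-form solution N(t) = N0 * exp(-lam * t) of dN/dt = -lam * N."""
    return n0 * math.exp(-lam * t)


def decay_euler(n0, lam, t, steps=100_000):
    """Integrate dN/dt = -lam * N with many small forward-Euler steps."""
    dt = t / steps
    n = n0
    for _ in range(steps):
        n += -lam * n * dt  # change in N during the small interval dt
    return n


if __name__ == "__main__":
    n0, lam, t = 1000.0, 0.5, 3.0  # hypothetical values, chosen for illustration
    print("analytic :", decay_analytic(n0, lam, t))
    print("numerical:", decay_euler(n0, lam, t))
```

The two numbers agree better and better as the number of steps grows, which is the numerical counterpart of ‘integrating’ the equation.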


Example: the wave equation. This equation is of interest
because waves appear all over the place in physics.
Not just water or sound waves: light is also a wave
phenomenon, and in quantum theory too we encounter wave
equations in many guises. Most prominent is the Schrödinger
equation, but also the Maxwell and Dirac equations
are basically wave equations, which after quantization will
have interpretations in terms of particles. And it is here that
the well-known quantessential catch phrase particle-wave
duality originates.


In one space and one time dimension the relativistic wave
equation takes the form of a differential equation with two
derivatives working successively on a function $f(x, t)$ of
space and time:

$$ \frac{\partial^2 f}{\partial t^2} - c^2\, \frac{\partial^2 f}{\partial x^2} = 0 . \qquad \text{(A.6)} $$

The solutions for $f$ are waves that move with a velocity
equal to $\pm c$, for example:

$$ f(x, t) = a \cos(\omega t - kx) . \qquad \text{(A.7)} $$


This solution has, besides the amplitude $a$, two parameters:
the angular frequency $\omega = 2\pi\nu$ and the wavenumber
$k = 2\pi/\lambda$. It looks like the wave pattern of Figure
A.2(d) moving either to the left or to the right. Indeed, taking
two derivatives means in Table A.1 that we move from the
column on the right to the column on the left. If you put this
into the equation and take the derivatives, you get an algebraic
equation $\omega^2 - c^2k^2 = 0$ for the parameters $\omega$ and $k$,
telling us that, as advertised, there are propagating
waves satisfying the equation with $\omega = \pm ck$, which
amounts exactly to the wave relation $\nu = c/\lambda$. Later on
we will see that quantization of this relation leads to the linear
dispersion relation $E(p) = \hbar\omega = c\hbar k = cp$, which
is characteristic for a massless particle. This reflects the
similarity of the above equation with the electromagnetic
wave equation (1.1.47).
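As a hedged illustration (not from the original text), one can check numerically that the cosine wave solves the wave equation only when ω = ±ck. The sketch below assumes the wave equation in the form ∂²f/∂t² − c² ∂²f/∂x² = 0 and estimates the second derivatives by central differences; the function names, the step size h and all numerical values are illustrative assumptions.

```python
import math


def wave(x, t, a, omega, k):
    """The trial solution f(x, t) = a * cos(omega * t - k * x)."""
    return a * math.cos(omega * t - k * x)


def wave_equation_residual(x, t, a, omega, k, c, h=1e-3):
    """Central-difference estimate of d2f/dt2 - c^2 * d2f/dx2 at (x, t)."""
    f0 = wave(x, t, a, omega, k)
    d2f_dt2 = (wave(x, t + h, a, omega, k) - 2 * f0 + wave(x, t - h, a, omega, k)) / h**2
    d2f_dx2 = (wave(x + h, t, a, omega, k) - 2 * f0 + wave(x - h, t, a, omega, k)) / h**2
    return d2f_dt2 - c**2 * d2f_dx2


if __name__ == "__main__":
    c, k = 4.0, 0.5
    # omega = c * k satisfies the equation (residual close to zero) ...
    print(wave_equation_residual(0.3, 0.7, 1.0, c * k, k, c))
    # ... while any other omega does not.
    print(wave_equation_residual(0.3, 0.7, 1.0, 3.0, k, c))
```

The residual is proportional to −(ω² − c²k²) f, so it vanishes exactly when the algebraic relation between ω and k holds.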


Footnote to equation (A.6): as $f$ depends on two variables we have to
distinguish the derivatives with respect to space and time, and we write
the curly derivative symbols called partial derivatives. The squares in
the derivatives mean that you apply the derivative operator twice, so
$\partial^2 f/\partial t^2 = (\partial/\partial t)^2 f$.


On algebras


In high school we have to learn elementary algebra, where
one represents variables (mostly corresponding to real
numbers) as abstract letter symbols, and one learns how
to manipulate the expressions according to certain rules
or operations that apply to real numbers, such as addition
and multiplication. The principal application is to solve
equations by exploiting these manipulations. For example,
given the quadratic equation $ax^2 + bx + c = 0$, the question
is to solve for the variable $x$ in terms of the constants
$a$, $b$ and $c$. One proves that there are two real solutions
given by $x_\pm = (-b \pm \sqrt{b^2 - 4ac})/2a$, provided the
expression under the square root is positive. So the advantage
of the abstract notation is that the answer applies for
any choice of the constants $a$, $b$ and $c$: it gives the general
solution.


Abstract algebra. Generally the subject of abstract algebra
deals with collections of objects such as numbers,
vectors, matrices, polynomials and functions for which binary
operations like addition and multiplication and possibly
more are defined (the inverse operations like subtraction
and division for example). The binary operations
may or may not be distributive: $a \times (b + c) = a \times b + a \times c$,
commutative: $a + b = b + a$, and associative:
$a + (b + c) = (a + b) + c$. You see that for the algebra
of ordinary numbers both addition and multiplication
are commutative and associative, and multiplication is
distributive over addition
(subtraction should be thought of as addition of a negative
number, $a - b = a + (-b)$, and division by a number as
multiplying by the inverse of the number). If you read the
next Math Excursion you will find that for the algebra of
(n × n) matrices the sum and product are distributive and
associative, but whereas matrix addition is commutative,
matrix multiplication is not.


A particularly simple algebra we will use in the next chap- 
ter is the Boolean algebra of binary numbers {0,1}. The 
algebra is defined by the operations displayed in the table 
below. They are distributive, commutative and associa- 
tive. 
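The claim that the Boolean operations are distributive, commutative and associative can be verified by brute force, since there are only two elements. The sketch below is an illustration added here, not part of the original text; it reads the addition rule 1 + 1 = 0 of Table A.2 as addition modulo 2 and the multiplication as the ordinary product of bits. The function names are hypothetical.

```python
from itertools import product


def add(a, b):
    """Addition of Table A.2: 1 + 1 = 0, i.e. addition modulo 2."""
    return (a + b) % 2


def mul(a, b):
    """Multiplication of Table A.2: the ordinary product of two bits."""
    return a * b


def check_laws():
    """Check the three laws for every combination of bits a, b, c."""
    for a, b, c in product((0, 1), repeat=3):
        assert add(a, b) == add(b, a)                          # commutative
        assert mul(a, b) == mul(b, a)
        assert add(a, add(b, c)) == add(add(a, b), c)          # associative
        assert mul(a, mul(b, c)) == mul(mul(a, b), c)
        assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))  # distributive
    return True


if __name__ == "__main__":
    print(check_laws())
```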


addition      multiplication
0 + 0 = 0     0 × 0 = 0
0 + 1 = 1     0 × 1 = 0
1 + 0 = 1     1 × 0 = 0
1 + 1 = 0     1 × 1 = 1


Table A.2: The Boolean algebra.


Algebraic structures that are widely applied in physics
are vector spaces, rings, groups and spaces of functions. 
It turns out that often subjects that begin as pastimes for 
the mathematically minded end up having great practical 
use in the realms of science and engineering. What we will 
see in this book over and again is that in the description of 
quantum states the notions of vectors, complex numbers, 
and matrices arise naturally. All these ingredients have a 
specific underlying algebraic structure. We discuss the al- 
gebra of complex numbers in the Math Excursion on page 
630, while matrix algebras are described in the next Math 
Excursion. 


Of particular interest in quantum theory is the algebra of 
observables consisting of (hermitian) self-adjoint opera- 
tors or matrices. These algebras correspond to so-called 
Lie algebras, which are directly linked to the theory of Lie 
groups, which in turn describe many of the symmetries that 
play a central role in (quantum) physics. Lie algebras are 
discussed in more detail on page 634, and Lie groups in 
the Excursion on page 635. 


It is evident that math and physics have co-evolved over 
centuries leading to a situation where modern theoreti- 
cal physics makes extensive use of modern and abstract 
mathematics. It is for that reason that I have decided to
throw in some (in fact more than average) math in this
semi-popular account of a subject like quantum theory.


On vectors and matrices


The reason for exploring vectors and matrices is that they
play a central role in the mathematical formulation of all 
of physics and in particular in quantum physics. In clas- 
sical physics we think of positions, momenta, angular mo- 
menta and forces as ordinary three-dimensional vectors. 
These are real vectors because their entries or compo- 
nents are real numbers. In electromagnetism and relativity 
we have encountered so-called relativistic four-component 
vectors which are also real. Quantum states are repre- 
sented by complex vectors and physical observables are 
represented by a class of complex matrices. This excursion
highlights some of the more important properties of
real vectors and matrices. We return to complex vectors
and matrices, which play a central role in part II of the
book, in a separate Math Excursion on page 632.


Real vectors. A vector can be viewed simply as an arrow
of a certain length in some n-dimensional Euclidean
space $\mathbb{R}^n$. Note that we also have the null-vector
corresponding to the origin. We denote column vectors by ket
vectors $|v\rangle$: they are elements of a vector space $V$, while
the row vectors are denoted by so-called bra vectors $\langle v|$,
and these are elements of a dual vector space $V^*$. We
can add and subtract vectors by just adding or subtracting
their corresponding components, and scale the vectors by
multiplying them by ordinary numbers. These are familiar
properties to most of you.


Vector components and choice of basis. If the dimension
of the vector space is $n$, we can choose sets of basis
vectors $\{|i\rangle\}$ and $\{\langle i|\}$ and expand vectors as
$|v\rangle = \sum_i v^i |i\rangle$ or $\langle v| = \sum_i v_i \langle i|$.
You may think of these basis vectors
as unit vectors along the different orthogonal axes of the
vector space. The reason for this subtle distinction between
row and column vectors is that we will encounter
different types of vector spaces in this book. We have already
seen the example of ordinary Euclidean vectors and
the relativistic Lorentz vectors. The differences between
these spaces become clear if we look at the definitions
for the invariant squared ‘length’ or the inner product of
vectors.


The inner, dot, or scalar product. Having a vector space
$V$ and its dual $V^*$ we may define an inner, dot or scalar
product between elements $v \in V^*$ and $w \in V$ as the number
obtained after adding the products of the corresponding
entries:

$$ \langle v|w\rangle = v\cdot w = \sum_i v_i w^i . $$

As an example we calculate the dot product of two
two-dimensional Euclidean vectors:

$$ \begin{pmatrix} 2 & 1 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = -2 + 1 = -1 . $$

Taking the dot product of a Euclidean vector with itself,
$\langle v|v\rangle = |v|^2$, always yields a sum of squares, corresponding
to a real number larger than or equal to zero, which is defined
as the length of the vector, $|v|$, squared. We also mention
that for real vectors the dot product is real and symmetric,
$v\cdot w = \langle v|w\rangle = \langle w|v\rangle = w\cdot v$.


As another relevant example we consider the Lorentzian
four-momentum vector $p^\mu = (E/c, \mathbf{p})$. The inner product
should produce the expression $p_\mu p^\mu = E^2/c^2 - \mathbf{p}^2$. This
means that the row vector (with lower indices) should be
$p_\mu = (E/c, -\mathbf{p})$. It is extremely useful then to define a
metric, which is just a matrix $\eta_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$,
which maps a column vector to its corresponding row vector
like $v_\mu = \sum_\nu \eta_{\mu\nu} v^\nu$. And therefore the inner product
can be written using this metric as $v\cdot w = \sum_{ij} g_{ij}\, v^i w^j$.
For the Euclidean case this metric is just the unit matrix
$g_{ij} = \delta_{ij} = \mathrm{diag}(1,1,\ldots,1)$. Observe that the value of
the inner product of a Lorentzian four-vector with itself is
not restricted: it can be either positive, negative or zero.
Furthermore, if this product is zero, this does not imply
that the vector itself has to be zero. It just means that the
corresponding particle has vanishing rest-mass.
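The role of the metric can be made concrete with a few lines of code. This is a minimal sketch added for illustration, assuming diagonal metrics stored as plain lists; the names `inner`, `EUCLID_2D` and `MINKOWSKI` and the sample four-momenta (in units where c = 1) are hypothetical choices, not taken from the text.

```python
def inner(v, w, metric):
    """Inner product sum_i g_ii * v^i * w^i for a diagonal metric g."""
    return sum(g * vi * wi for g, vi, wi in zip(metric, v, w))


EUCLID_2D = [1, 1]           # unit matrix: ordinary dot product
MINKOWSKI = [1, -1, -1, -1]  # diag(1, -1, -1, -1)

if __name__ == "__main__":
    # Euclidean example: (2, 1) . (-1, 1) = -2 + 1
    print(inner([2, 1], [-1, 1], EUCLID_2D))

    # Four-momenta (E/c, p) in units with c = 1 (illustrative numbers):
    photon = [1, 1, 0, 0]    # |p| = E/c, so the invariant vanishes
    massive = [5, 3, 0, 0]   # E^2/c^2 - p^2 = 25 - 9
    print(inner(photon, photon, MINKOWSKI))
    print(inner(massive, massive, MINKOWSKI))
```

The photon-like vector is not the zero vector, yet its Lorentzian inner product with itself vanishes, which is the situation described above for a particle with vanishing rest mass.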


We have given a graphical representation of the scalar or 
dot product of two vectors in Figure A.8(a), which under- 
scores the fact that the dot-product produces a number, 
not a vector, and for that reason it is also called the scalar 
product. 


The exterior or cross product of two vectors. In three
dimensions one may indeed also define a ‘vector’, ‘exterior’
or ‘cross’ product between vectors, which produces
a vector $w$ out of two vectors $u$ and $v$, and one writes
$w = v \times u$. There is no simple extension of such a vector
product to general dimensions.


Matrices. Matrices come in many kinds, appear all
over the place and have zillions of applications throughout
the sciences. The term refers to a two-dimensional array of
elements, like for example the apartment building of Figure
A.6. The entries of a matrix are often numbers that refer 
to information about the — in the example at hand — apart- 
ment: how many bedrooms, or how many people, or their 
income, their age etc. In this book we will only employ 
square (n x n) matrices that will satisfy various additional 
properties that derive naturally from certain physical re- 
quirements in the specific applications we discuss. There 
are many ways to look at a matrix: the most neutral way is 
to say that it is a square array of (real or complex) numbers 
(see Figure A.7(a)). For example a distance table between 
n cities would be like a real (n x n) matrix. Another way 
to look at a matrix would be to distinguish the set of diag- 
onal elements, the elements in the upper triangle and the 
elements of the lower triangle (figure A.7(b)). And some- 
times it is convenient to think of a matrix as a stack of n 
n-dimensional row or column vectors as indicated in Fig- 
ures A.7(c) and A.7(d). 


Figure A.6: The Matrix. A matrix is a two-dimensional array of
elements. You may think of this apartment building as a 6 × 4
matrix, with 6 rows and 4 columns, where the apartments are
labeled like the corresponding matrix entries. The entries may
refer to information about the inhabitants of the apartments, like
the family size, their income, etc. But the analogy is of limited
use, as we are not adding or multiplying apartment buildings, or
assigning any meaning to their eigenvectors and such. (Source:
Alamy.)


Matrix algebra. Now the matrices themselves also form
a vector space, because we may add and subtract them,
there is a ‘null-matrix’ (with all entries equal to zero), and we
may multiply a matrix by an arbitrary constant (by just
multiplying each entry of the matrix by that constant). There is
more: we may also define a multiplication for matrices, as
we will see shortly. And in view of the previous Math Excursion
this means that the set of n × n matrices forms an
algebra. To define division for matrices is a little more
intricate: we basically define it by multiplying by the inverse of
the matrix, where the inverse $A^{-1}$ of $A$ is defined as the
matrix that satisfies $A^{-1}A = AA^{-1} = 1$, where 1 is the
unit matrix with only ones on the diagonal. This raises the
follow-up question of under which conditions the inverse is
a well-defined matrix itself. And this question may remind
you of the serious elementary school dictum: never divide
by the number zero! For matrices the rule is that the inverse
exists if the determinant of the matrix is non-zero.
The determinant is a number that can be calculated given the
matrix, but we will not go into detail here. So certain matrices
have inverses and others have not, and the determinant gives a
relatively simple criterion which tells you if the inverse of a
certain square matrix exists. Including the multiplication we speak
of a matrix algebra, as we can perform algebraic manipulations
with matrices similar to what we do with numbers. There
is a well-established basic branch of mathematics called
‘linear algebra’, and there are many textbooks covering the
world of matrices in great detail.
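For the smallest non-trivial case the determinant criterion is easy to try out. The sketch below is an illustration added here and goes slightly beyond the text, which does not spell out the determinant: it uses the standard formula ad − bc for a 2 × 2 matrix with rows (a, b) and (c, d). The function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def det2(m):
    """Determinant a*d - b*c of the 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Inverse of a 2x2 matrix; fails when the determinant is zero."""
    (a, b), (c, d) = m
    det = det2(m)
    if det == 0:
        raise ValueError("determinant is zero: the matrix has no inverse")
    det = Fraction(det)
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Matrix product: entry (i, j) is row i of p dotted with column j of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


if __name__ == "__main__":
    a = [[2, 1], [1, 1]]            # determinant 1, so the inverse exists
    print(matmul(inverse2(a), a))   # the unit matrix
    print(det2([[1, 2], [2, 4]]))   # 0: 'never divide by zero' for matrices
```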


Matrix as linear transformation of vectors. Now vectors
can also be multiplied by matrices to produce another
vector; the way that is done is pictorially indicated for a
column vector in Figure A.8(b). This action of matrices on vectors
is most easily understood if you think of the matrix
as a stack of row vectors. The action can also be considered
as a transformation of a vector into another vector. A
simple example may help:


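$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix} . $$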


The matrix acts as a linear operator on the vector space,
as it reshuffles the components into linear combinations of
them. We may say that (n × n) matrices map the vector
space $V$ to itself and we write $A : V \to V$. There is for
example a particular subset of (3 × 3) matrices whose action
on ‘ordinary’ vectors corresponds to rotating those
vectors in three-dimensional space $\mathbb{R}^3$.
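The ‘stack of row vectors’ picture translates directly into code. The following is a minimal sketch added for illustration: each component of the output is the dot product of one row of the matrix with the input vector. The rotation about the z-axis is a standard example of the (3 × 3) rotation matrices mentioned above; the function names and numbers are hypothetical.

```python
import math


def matvec(m, v):
    """Matrix times column vector: one dot product per row of m."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def rotation_z(theta):
    """Standard 3x3 matrix rotating vectors by angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    print(matvec([[1, 2], [3, 4]], [1, 1]))   # rows dotted with (1, 1)
    # A quarter turn carries the x-axis into the y-axis (up to rounding).
    print(matvec(rotation_z(math.pi / 2), [1.0, 0.0, 0.0]))
```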


Another example which shows the descriptive power of 
matrices as operators on state vectors is in (quantum) com- 
putation, where generically we think of computation as a 
sequence of gates, interactions/manipulations or measu- 
rements that change the states of a set of (qu)bits. 


Such processes or computations can be represented by 
a product of matrices. Indeed the complete computation 
is just a matrix mapping the in-state on the out-state vec- 
tor. 


Figure A.7: Four ways to think about a matrix. Graphical representation of the many guises of a matrix (artist impression).
(a) A 4 × 4 square matrix can be thought of as a table of 4² = 16 numbers or symbols representing them.
(b) Square matrix built up of three parts: upper triangular (red), diagonal, and lower triangular (blue).
(c) A matrix can also be viewed as a stack of row vectors.
(d) A matrix can also be viewed as a stack of column vectors.


Figure A.8: Multiplications. Graphical representation and building up of products of vectors and matrices.
(a) The inner, scalar or dot product of a row and a column vector yields the single number obtained by adding the product of subsequent row entries with the corresponding column entries: $\langle g|b\rangle = g\cdot b = \sum_i g_i b_i$.
(b) The product of a matrix $G$ with a column vector $b$ yields again a column vector $c$, obtained by taking the dot product of subsequent row vectors of $G$ with the column vector $b$: $|c\rangle = G|b\rangle = G\cdot b$, meaning $c_i = \sum_j G_{ij} b_j$.
(c) The matrix product. Each entry in the product matrix $C$ equals the dot product of the i-th row vector of the first matrix $A$ with the j-th column vector of the second matrix $B$, so $C_{ij} = \sum_k A_{ik} B_{kj}$.


Figure A.9: Eigenvectors and eigenvalues. Given a matrix one
defines the eigenvectors as a set of special vectors which satisfy
an eigenvalue equation (A.8).


Eigenvectors and eigenvalues. Given a matrix $A$ one
defines the eigenvectors of $A$ as a set of special vectors
$\{|a_k\rangle\}$ that satisfy the following equation:

$$ A\,|a_k\rangle = a_k\,|a_k\rangle , \qquad \text{(A.8)} $$

where the numbers $a_k$ are the corresponding eigenvalues.
So acting on an eigenvector the matrix $A$ gives that same
vector back up to a constant, which is by definition the
eigenvalue. This is illustrated in Figure A.9. The set of
eigenvalues $\{a_k\}$ is called the spectrum of the matrix. In
quantum theory the observables are represented by Hermitian
matrices and in that case the eigenvalues are real
and the spectrum is called the sample space of the operator
$A$.
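For a real symmetric 2 × 2 matrix (the real counterpart of a Hermitian matrix) the eigenvalue equation can be solved in closed form. The sketch below is an illustration added here, under that restriction; the function names are hypothetical. It returns the two real eigenvalues with an eigenvector each, and the usage example checks that the matrix indeed gives each eigenvector back up to its eigenvalue.

```python
import math


def eigen_symmetric_2x2(a, b, d):
    """Return [(eigenvalue, eigenvector), ...] for the matrix [[a, b], [b, d]]."""
    if b == 0:
        # Already diagonal: the basis vectors are the eigenvectors.
        return sorted([(float(a), (1.0, 0.0)), (float(d), (0.0, 1.0))])
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    # From the first row of (A - lam * 1) v = 0 one can take v = (b, lam - a).
    return [(lam, (float(b), lam - a)) for lam in (mean - radius, mean + radius)]


def apply(a, b, d, v):
    """Act with the matrix [[a, b], [b, d]] on the vector v."""
    return (a * v[0] + b * v[1], b * v[0] + d * v[1])


if __name__ == "__main__":
    for lam, v in eigen_symmetric_2x2(2, 1, 2):
        print(lam, v, apply(2, 1, 2, v))   # the last tuple equals lam * v
```

The list of eigenvalues returned is the spectrum of the matrix in the sense used above.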


The matrix product. Once we have defined the action of
matrices on vectors the step to the multiplication of matrices
is straightforward and we have indicated it in Figure
A.8(c). The (ij)-entry of the product matrix $C = AB$ is obtained
as the dot product of the i-th row vector of $A$ with
the j-th column vector of $B$. Let us again give a simple
example:


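$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix} . \qquad \text{(A.9)} $$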


Types of matrices. A distance table between n different
cities is a square (n × n) matrix, a rather special one for
sure, because its diagonal elements are all zero and it is
symmetric with respect to that diagonal: the upper triangular
and lower triangular parts are each other’s mirror
image. Such a matrix is completely determined by specifying
its n(n − 1)/2 upper triangular entries.


Depending on the situation we may want to put additional 
constraints which define a subset of matrices. If the ad- 
ditional properties are preserved under the basic matrix 
operations, the subset forms a subalgebra of the original 
algebra. The additional properties involve typical matrix 
manipulations which we have represented symbolically in 
Figure A.10. A fundamental notion is the transpose of a matrix,
denoted by the matrix $A^T$, which is obtained from $A$ as
indicated in Figures A.10(a) and A.10(b); written in terms
of its entries one has $(A^T)_{ij} = A_{ji}$. The transpose can be
obtained by mirroring the matrix in the diagonal, but can
also be obtained by interchanging rows and columns. Repeating
the operation brings you back to the original matrix.
What happens if we take the transpose of a product of
matrices? Referring again to Figure A.8(c), one sees that
taking the transpose of the matrix $C = AB$ on the right-hand
side we get a matrix which is the product of the transposes,
but in the opposite order: $C^T = B^T A^T$.
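Both the matrix product and the transpose rule are easy to try out on small matrices. The sketch below is an added illustration with arbitrarily chosen matrices; the function names are hypothetical. It also shows that the order of the factors matters, so matrix multiplication is not commutative.

```python
def matmul(a, b):
    """Matrix product: C_ij = sum_k A_ik * B_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Interchange rows and columns: (A^T)_ij = A_ji."""
    return [list(row) for row in zip(*a)]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]   # illustrative matrices, not from the text
    b = [[0, 1], [1, 0]]
    print(matmul(a, b), matmul(b, a))                 # AB differs from BA
    print(transpose(matmul(a, b)))                    # (AB)^T ...
    print(matmul(transpose(b), transpose(a)))         # ... equals B^T A^T
```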


Now it is also straightforward to define a symmetric or
antisymmetric matrix as one that satisfies $A = \pm A^T$
(see Figure A.10(c)). Note that a symmetric (n × n) matrix
contains n(n + 1)/2 independent real numbers, while the antisymmetric
one has only n(n − 1)/2, because the diagonal elements
have to be zero for the latter.


Invariance of the inner product. We have shown that 
the product of a vector with itself defines the length of a 
vector, and we all know that the length of a vector does 
not change if we rotate the vector around. So we say that 
the length of a vector is invariant under rotations. Also the 
angle between two vectors is invariant under rotations. In 
other words the inner product of two vectors is invariant 
under rotations. 
Figure A.10: Matrix properties. Graphical representation of some basic properties of matrices.
(a) A square matrix built up of three parts: upper triangular (red), diagonal (white) and lower triangular (blue).
(b) Transpose of the matrix depicted in (a), obtained by reflecting in the diagonal or by interchanging the rows and columns of the matrix.
(c) A symmetric matrix is equal to its transpose. A distance table between four cities would be a symmetric matrix with zeros along the diagonal.
(d) An antisymmetric matrix is a matrix whose transpose equals minus the matrix. In other words: $A_{ji} = -A_{ij}$.
As the rotations involve a transformation of the vector into
another vector, it follows that rotations can be represented
by matrices acting on the vector space $V$. And for real vectors
this matrix has to be a real matrix. Imagine we act with
a rotation matrix $R$ on $|v\rangle$. We may write $|v'\rangle = R\,|v\rangle$, and
it then follows that $\langle v'| = \langle v|R^T$. Invariance of the inner
product of two arbitrary vectors now requires that


$$ \langle v'|w'\rangle = \langle v|R^T R|w\rangle = \langle v|w\rangle \quad\Leftrightarrow\quad R^T = R^{-1} . \qquad \text{(A.10)} $$


What this equation is telling us is that the matrices R that 
represent rotations have to satisfy the property that their 
transpose equals their inverse. Matrices that have that 
property are called orthogonal matrices. There is an additional
important property that these matrices have to satisfy.
If we do two subsequent rotations
on a vector, then that is the same as doing a single rotation
that brings the vector directly from its original to its final
orientation. Translated into the language of rotation matrices
this means that the product of two orthogonal matrices is
again an orthogonal matrix. And one says that the collection
of all such matrices defines a group; for the case at
hand this is the so-called rotation group in n dimensions,
denoted by SO(n). An element of the SO(n) group is specified
by n(n − 1)/2 independent parameters.
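The orthogonality condition and the group property can be checked for rotations in the plane, where n(n − 1)/2 = 1 and a rotation is fixed by a single angle. This is a minimal sketch added for illustration; the function names, the tolerance and the angles are assumptions.

```python
import math


def rotation(theta):
    """2x2 matrix rotating vectors in the plane by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_orthogonal(m, tol=1e-12):
    """Check R^T R = 1 entry by entry, up to rounding errors."""
    p = matmul(transpose(m), m)
    n = len(m)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    r1, r2 = rotation(0.3), rotation(1.1)
    print(is_orthogonal(r1), is_orthogonal(matmul(r1, r2)))
    print(matmul(r1, r2), rotation(1.4))   # two rotations compose to one
```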


What about the four-vectors whose inner product involves
not the unit matrix, but rather the diagonal 4 × 4 matrix
$\eta_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$? Now we have to impose a
different invariance condition on the transformation matrices
$\Lambda$; it reads $\Lambda^T \eta\, \Lambda = \eta$. The Lorentz transformations
are defined by the condition that they leave the inner
product matrix or metric, $\eta$, invariant. The associated,
so-called Lorentz group is then denoted as SO(1,3), as the
metric has one plus sign and three minus signs.
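The Lorentzian invariance condition can be illustrated in the simplest setting of one time and one space dimension, where the metric reduces to diag(1, −1). The sketch below is an addition for illustration only: it uses the standard boost matrix with velocity β = v/c acting on (ct, x) and checks that the transformation matrix, sandwiched around the metric with its transpose, returns the metric. The restriction to 1 + 1 dimensions, the function names and the numbers are assumptions made to keep the example short.

```python
import math

ETA = [[1.0, 0.0], [0.0, -1.0]]   # metric diag(1, -1) for (ct, x)


def boost(beta):
    """Standard Lorentz boost with velocity beta = v/c acting on (ct, x)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta], [-gamma * beta, gamma]]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def preserves_metric(lam, tol=1e-12):
    """Check the condition Lambda^T eta Lambda = eta up to rounding."""
    p = matmul(matmul(transpose(lam), ETA), lam)
    return all(abs(p[i][j] - ETA[i][j]) <= tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(preserves_metric(boost(0.6)))                    # a boost passes
    c, s = math.cos(0.3), math.sin(0.3)
    print(preserves_metric([[c, -s], [s, c]]))             # a rotation fails
```

An ordinary rotation mixing ct and x preserves the Euclidean unit matrix but not the Lorentzian metric, which is why the two groups are different.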


On vector calculus


In this excursion we touch on three important theorems
with respect to integrating equations involving the vector
derivative $\nabla$ of fields. These theorems refer respectively
to the line integral, an integral over an area and a volume
integral.


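Operators involving the vector derivative $\nabla$.
We have been talking about fields such as a force field
$\mathbf{F}(\mathbf{x})$, a current density $\mathbf{j}(\mathbf{x})$ or the electric and magnetic
fields $\mathbf{E}(\mathbf{x})$ and $\mathbf{B}(\mathbf{x})$. Such a vector field defines a vector
at any point in space(time). We have also encountered the
vector of derivatives called nabla:


$$ \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) , $$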


which plays a fundamental role in the calculus of (vector) 
fields which features as we have seen in the Maxwell equa- 
tions of electromagnetism, but as a matter of fact it plays 
an equally important role in the subject of fluid dynamics. If 
the equations involve the nabla operator, then solving the 
equation means that we somehow have to ‘integrate’ the 
equation. The mathematics involved is denoted as vector 
calculus in contradistinction to vector algebra, which only 
involves algebraic manipulations of vectors. 


The gradient of a scalar function yields a vector field. In
this chapter we have encountered various definitions where
a vector field was defined as the vector derivative or gradient
of a scalar potential function $V(\mathbf{x})$, like for example
the relations:


$$ \mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x}) , $$
$$ \mathbf{E}(\mathbf{x}) = -\nabla V(\mathbf{x}) . $$


When discussing the Maxwell equations we also encoun- 
tered vector derivatives of vector functions. Here we dis- 
tinguish the following two possibilities: 
Figure A.11: The electrostatic potential for a dipole. This is the
potential $V(x, y)$, with some equipotential lines, resulting from
two opposite charges placed at opposite points on the x
axis.


(i) The divergence of a vector field, which yields a scalar
function, for example:


$$ \rho(\mathbf{x}) = \nabla\cdot\mathbf{E}(\mathbf{x}) . $$


(ii) The curl of a vector field, which yields another vector
field, for example:


$$ \mathbf{j} = \nabla\times\mathbf{B} , $$
$$ \mathbf{B} = \nabla\times\mathbf{A} . $$


These operations contain first-order derivatives and are
thus linear in nabla. We also need higher-order derivatives;
apart from definitions like the ‘Laplacian’ $\Delta = (\nabla\cdot\nabla)$,
there exist additional mathematical identities. In Chapter
1.2 we used already two of them:


$$ \nabla\cdot(\nabla\times\mathbf{A}) = 0 , \qquad \text{(A.11a)} $$
$$ \nabla\times(\nabla V) = 0 . \qquad \text{(A.11b)} $$


One more useful identity is basically rewriting the repeated
vector product of the nabla operator:


$$ \nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - (\nabla\cdot\nabla)\mathbf{A} , \qquad \text{(A.12)} $$

where the Laplacian in the last term is understood as acting
on the components of the vector $\mathbf{A}$ individually.


We emphasize that the above are identities, meaning that
they hold for any vector field $\mathbf{A}(\mathbf{x},t)$ and any scalar field
$V(\mathbf{x},t)$.


Figure A.12: The electric dipole field. This is the dipole field
$\mathbf{E}(x, y)$ corresponding to minus the gradient of the potential
depicted in the previous figure. We have drawn the field lines;
these are the stream lines of the field. At any point the field is
directed along the tangent of the line going through that point,
and the magnitude is proportional to the density of lines around
that point. The closed equipotential lines are projected in the
plane and we see that the field lines are orthogonal to them.
This means that the field lines are the projections of the lines of
steepest descent on the surface of the previous figure.


To solve systems like the Maxwell equations we are inter- 
ested in ‘integrating’ expressions involving the basic vector 
derivatives, this is facilitated by some powerful theorems 
that we will look at next. 
Figure A.13: A line integral. In the upper picture we give a
two-dimensional potential surface $V(\mathbf{x})$. The force field is defined as
$\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x})$. If we choose a path from point $\mathbf{x}_0$ to $\mathbf{x}_1$, we
can integrate $\mathbf{F}$ along that path, meaning that we integrate the
component tangential to the path. This line integral yields the
value $W = V(\mathbf{x}_0) - V(\mathbf{x}_1)$, which equals the work performed by
the force, which in this case is negative. We had to exert a force to
go uphill and therefore the potential energy was increased. Note
that the outcome is independent of the path chosen.

We have seen that the Maxwell equations are first-order 
partial differential equations for the vector fields E and B. 
That means that given the sources one could solve these 
equations by integrating them. It is here that some pow- 
erful integration theorems for vector derivatives can be ex- 
ploited. These lead to what is often called the integrated 
form of the Maxwell equations, which no longer contain 
any spatial derivatives of the fields. 


Figure A.14: A surface integral (Stokes’ theorem). The figure is a pictorial
representation of Stokes’ law, which says that integrating the
component of the curl of a vector field $(\nabla\times\mathbf{B})$ orthogonal to an
arbitrary surface, over an area $A$, equals the line integral of that
vector field along the closed boundary contour $\partial A$ of that area.


Integration theorems for vector derivatives.


We will consider the following cases:


(i) The line integral of a gradient field along a curve $\gamma$, for
example:


$$ \int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}(\mathbf{x})\cdot d\mathbf{l} = -\int_{\mathbf{x}_0}^{\mathbf{x}_1} \nabla V(\mathbf{x})\cdot d\mathbf{x} = V(\mathbf{x}_0) - V(\mathbf{x}_1) , $$


where the line element $d\mathbf{l}$ is the infinitesimal vector tangent to the
curve. We discussed this example already in Chapter 1.1.
In ordinary language this refers to the statement that if you
apply a force on an object, then the integral of that force
along a given path corresponds to the work applied to the
object and that equals the increase of the potential energy
of the object, as we have indicated in Figure A.13. This
increase equals the difference of the potential energies at
the endpoints of the path. The fact that the difference only
depends on the endpoints means that the increase of energy
is not dependent on the particular path chosen. If you
want to climb to the top of a mountain you can choose between
a path that is long and not so steep or a very short,
very steep path; in either case you would have to deliver
the same amount of energy.
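Path independence can be made tangible with a small numerical experiment. The sketch below is an addition for illustration, with a hypothetical potential V(x, y) = x² + 2y² and two hypothetical paths from (0, 0) to (1, 1). It sums F·dl over many short segments along each path and compares the result with V(x0) − V(x1).

```python
import math


def potential(x, y):
    """Hypothetical potential surface V(x, y) = x^2 + 2 y^2."""
    return x * x + 2 * y * y


def force(x, y):
    """F = -grad V for the potential above."""
    return (-2 * x, -4 * y)


def line_integral(path, steps=20_000):
    """Sum F . dl over short straight segments of path(s), s from 0 to 1."""
    total = 0.0
    x_prev, y_prev = path(0.0)
    for i in range(1, steps + 1):
        x, y = path(i / steps)
        fx, fy = force((x + x_prev) / 2, (y + y_prev) / 2)  # force at segment midpoint
        total += fx * (x - x_prev) + fy * (y - y_prev)
        x_prev, y_prev = x, y
    return total


def straight(s):
    """Straight path from (0, 0) to (1, 1)."""
    return (s, s)


def detour(s):
    """A curved path between the same two endpoints."""
    return (s**3, math.sin(math.pi * s / 2))


if __name__ == "__main__":
    print(line_integral(straight), line_integral(detour))
    print(potential(0, 0) - potential(1, 1))   # V(x0) - V(x1)
```

Both paths give the same negative work, in line with the uphill situation of Figure A.13.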




Figure A.15: A vortex field. The velocity field $\mathbf{v}(\mathbf{x})$ of an ideal
or free vortex around a source where the vorticity $\boldsymbol{\omega}$ is non-zero
in a small region around the origin and pointing along the axis
perpendicular to the plane of the figure.


(ii) The surface integral of a curl over a given area $A$,
known as Stokes’ theorem:


$$ \int_A (\nabla\times\mathbf{B})\cdot\hat{\mathbf{n}}\; d^2S = \oint_{\partial A} \mathbf{B}\cdot d\mathbf{x} , $$


where on the left-hand side $\hat{\mathbf{n}}$ is the unit vector perpendicular
to the surface element $d^2S$, and on the right-hand side
we integrate the vector field $\mathbf{B}$ along the boundary $\partial A$ of
the surface area. This mathematical theorem is illustrated
in Figure A.14.


The most familiar application is in fluid mechanics where
the vector field defining the flow is the velocity field $\mathbf{v}(\mathbf{x}, t)$.
The vorticity $\boldsymbol{\omega}$ of the fluid is then defined as the curl of the
velocity field:

$$ \boldsymbol{\omega} = \nabla\times\mathbf{v} . $$


The simplest example is a situation where the vorticity is
non-zero only on the z-axis, as a constant vector in
the positive z-direction. Then the solution for the velocity
field is the familiar cylindrical free vortex flow around the
z-axis, corresponding to an ideal vortex. A related quantity
is the circulation of the flow, defined as a surface integral
of the vorticity, which by Stokes’ theorem equals the line integral of the
velocity around the closed loop bounding the surface area. In
the example where $\boldsymbol{\omega} = k\,\hat{\mathbf{z}}$ only on the z-axis, one obtains
that for a loop winding once around the z-axis, the
circulation $\gamma$ equals $\gamma = k$. Taking a horizontal circle
around the z-axis we get a cylindrically symmetric, free
vortex field with a velocity that drops off inversely
proportional to the radius: $v(r) = k/2\pi r$, as depicted in
Figure A.15. A beautiful, not so ideal vortex is the tornado
depicted in Figure A.16.


Figure A.16: A tornado. A tornado is an aerodynamical flow
pattern with vorticity and a non-zero circulation.
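The statement about the circulation can be checked numerically. The sketch below is an added illustration: it takes the free vortex field with speed v(r) = k/2πr directed tangentially around the origin, and sums v·dl around circular loops. Loops that wind once around the axis should give the same value k whatever their radius, while a loop that does not enclose the axis should give zero. The function names, the value of k and the loops are hypothetical.

```python
import math


def vortex_velocity(x, y, k):
    """Free vortex: speed k / (2 pi r), directed tangentially around the origin."""
    r2 = x * x + y * y
    return (-k * y / (2 * math.pi * r2), k * x / (2 * math.pi * r2))


def circle(radius, cx=0.0, cy=0.0):
    """Closed loop: a circle of the given radius around the point (cx, cy)."""
    return lambda s: (cx + radius * math.cos(2 * math.pi * s),
                      cy + radius * math.sin(2 * math.pi * s))


def circulation(loop, k, steps=20_000):
    """Sum v . dl over short segments of the closed loop(s), s from 0 to 1."""
    total = 0.0
    x_prev, y_prev = loop(0.0)
    for i in range(1, steps + 1):
        x, y = loop(i / steps)
        vx, vy = vortex_velocity((x + x_prev) / 2, (y + y_prev) / 2, k)
        total += vx * (x - x_prev) + vy * (y - y_prev)
        x_prev, y_prev = x, y
    return total


if __name__ == "__main__":
    k = 2.0
    print(circulation(circle(1.0), k), circulation(circle(2.5), k))  # both close to k
    print(circulation(circle(1.0, cx=3.0), k))                       # close to zero
```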


In electrodynamics one applies Stokes’ theorem to
Ampère’s law, yielding


$$ \oint_{\partial A} \mathbf{B}\cdot d\mathbf{x} = \int_A \mathbf{j}\cdot\hat{\mathbf{n}}\; d^2S . $$


This is basically the ‘integrated form’ of Ampère’s law, the
equation $\nabla\times\mathbf{B} = \mathbf{j}$, that was already depicted on the left
in Figure 1.1.18.
Figure A.17: A volume integral (Gauss’ theorem). The figure illustrates Gauss’
law, which states that the volume integral of the divergence $(\nabla\cdot\mathbf{E})$ of
a vector field $\mathbf{E}$ equals the surface integral of the perpendicular
component of that vector field over the closed surface $(\partial V)$
bounding the volume $V$.


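Stokes’ theorem also applies to the magnetic flux through a
bounded surface, which becomes equal to the loop integral
of the vector potential $\mathbf{A}$, which is defined by the equation
$\mathbf{B} = \nabla\times\mathbf{A}$:


$$ \Phi = \int_S \mathbf{B}\cdot\hat{\mathbf{n}}\; d^2S = \oint_{\partial S} \mathbf{A}\cdot d\mathbf{x} . $$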


(iii) The volume integral of a divergence over a volume $V$,
known as Gauss’ theorem:


$$ \int_V \nabla\cdot\mathbf{E}\; d^3V = \oint_{\partial V} \mathbf{E}\cdot\hat{\mathbf{n}}\; d^2S , $$


where the integral on the right-hand side is over the closed
surface $S = \partial V$ bounding the volume $V$. This theorem is
depicted in Figure A.17.


We can apply it to the first Maxwell equation (1.1.26) as
follows:


$$ \int_V \rho(\mathbf{x})\; d^3V = \oint_{\partial V} \mathbf{E}\cdot\hat{\mathbf{n}}\; d^2S = Q , $$


telling us that integrating the perpendicular component of
the electric field over a closed surface bounding a volume
yields the total electric charge inside that volume.
. . . But ignorance of the different causes involved in the
production of events, as well as their complexity, taken 
together with the imperfection of analysis, prevent our 
reaching the same certainty [as in astronomy] about the 
vast majority of phenomena. Thus there are things that 
are uncertain for us, things more or less probable, and 
we seek to compensate for the impossibility of knowing 
them by determining their different degrees of likelihood. 
So it is that we owe to the weakness of the human mind 
one of the most delicate and ingenious of mathematical 
theories, the science of chance or probability. 

(Laplace, 1889) 


Probabilities. A variable $x$ can take on values in a discrete or maybe a continuous set, a domain or sample space that we will denote by $\mathcal{X} = \{x_i\}$. A random or stochastic variable is one where we associate with that variable a probability distribution over the domain, so we introduce a probability function $p_i = p(x_i)$ that gives the chance or probability that $x$ will have the value $x_i$. As the variable $x$ always carries some value, we have to require that the probabilities add up to one:
$\sum_i p_i = 1 .$ (A.13)
Given a random variable and its probability distribution, we can calculate the average outcome of a number of statistically independent measurements of $x$ or, for that matter, of any function $f(x)$ of $x$. It is simply given by the expectation value or average defined as:
$\langle f \rangle = \sum_i p_i f(x_i) .$ (A.14)
So for a fair die we have that $\mathcal{X} = \{1,2,\ldots,6\}$ and $p_i = 1/6$ for all $i$, and therefore one calculates for example that
$\langle x \rangle = \sum_i p_i x_i = \frac{1}{6}(1+2+\cdots+6) = \frac{7}{2} .$
Figure A.18: The distributions $P(x,n,6)$, with $x = x(1)+\ldots+x(n)$, for throwing $n$ fair dice. For large $n$ this symmetric distribution approaches the normal or Gaussian distribution.


We can ask the same questions for the sum of the outcomes if we throw two dice. We now first have to determine the domain of $x = x(1)+x(2)$, which is $\{2,3,\ldots,12\}$. The probability of each outcome equals the number of distinct combinations of the two dice that give that outcome, divided by the total number of combinations. For example, of the $6\times 6 = 36$ possible combinations, the outcome $x = 7$ can be obtained in 6 distinct ways, namely,


$(x(1),x(2)) = (1,6), (2,5), (3,4), (4,3), (5,2), (6,1).$


So the probability is $p(x=7) = 6/36 = 1/6$. One can similarly construct distributions $P(x,n)$ for $n$ dice, and these are depicted in Figure A.18 for an increasing number of dice.
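The counting argument above is easy to reproduce by brute force. The following is a minimal sketch, not part of the original text: the function name `dice_sum_distribution` is our own choice, and we simply enumerate all $6^n$ combinations and count how often each sum occurs.

```python
from fractions import Fraction
from itertools import product


def dice_sum_distribution(n, sides=6):
    """Exact distribution of the sum of n fair dice, as {sum: probability}."""
    counts = {}
    # Enumerate every combination (x(1), ..., x(n)) and count the sums.
    for outcome in product(range(1, sides + 1), repeat=n):
        total = sum(outcome)
        counts[total] = counts.get(total, 0) + 1
    # Each combination is equally likely, so divide by the total number.
    return {x: Fraction(c, sides ** n) for x, c in sorted(counts.items())}


two_dice = dice_sum_distribution(2)
for x, p in two_dice.items():
    print(x, p)
```

For two dice the entry for $x = 7$ comes out as 1/6, in agreement with the six combinations listed above. Exact fractions are used so that no rounding enters the comparison.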
Another important quantitative measure of a distribution is the standard deviation $\sigma$ and its square, called the variance or mean square deviation, which is defined as:
$\sigma^2 = \langle (x - \langle x\rangle)^2 \rangle = \langle x^2\rangle - \langle x\rangle^2 .$ (A.15)
The variance is a measure of the width of the distribution. For the dice examples one finds that for one die $\sigma = \sqrt{35/12} \approx 1.71$ and for the pair $\sigma = \sqrt{35/6} \approx 2.42$.
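As a check on these numbers, here is a small sketch that evaluates the expectation value (A.14) and the variance (A.15) for one die and for a pair of dice. The helper names `expectation` and `variance` are our own and not taken from the text.

```python
from fractions import Fraction
from math import sqrt


def expectation(dist, f=lambda x: x):
    """Expectation value <f> = sum_i p_i f(x_i) for dist = {x_i: p_i}."""
    return sum(p * f(x) for x, p in dist.items())


def variance(dist):
    """Variance <x^2> - <x>^2 of the distribution."""
    mean = expectation(dist)
    return expectation(dist, lambda x: x * x) - mean ** 2


one_die = {x: Fraction(1, 6) for x in range(1, 7)}

# Distribution of the sum of two independent dice.
two_dice = {}
for a, pa in one_die.items():
    for b, pb in one_die.items():
        two_dice[a + b] = two_dice.get(a + b, 0) + pa * pb

print(expectation(one_die), variance(one_die), sqrt(variance(one_die)))
print(expectation(two_dice), variance(two_dice), sqrt(variance(two_dice)))
```

The variances come out as 35/12 and 35/6. The variance of the pair is exactly twice that of a single die, as expected for the sum of two independent variables.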


Statistics. Having a stochastic variable one can make 
measurements at a series of times $t_n$, and one may study
the frequency distribution of outcomes and compare it for 
example with a theoretically predicted probability distribu- 
tion. Here we enter the field of statistics, of statistical anal- 
ysis. The challenge of statistical analysis is to understand 
from the measurements, what the set of sample values you 
have taken tells you about the true distribution. The cen- 
tral and vital question is what conclusions can you draw 
from some experiment and with what degree of certainty 
or confidence. 


Say the height of males in cm for a certain country has a certain distribution $H(h)$, which may peak around 170 cm. Now we can take a sample of the population and from the sample construct the sample distribution, which is an approximation of the real distribution, and it will not surprise you that by making the sample ever larger the approximation will get better. It may also be that you are probing a space of choices that people make and try to predict the probability of the next choices that will be made. The business of polling is in this category. Politicians and public media frequently demonstrate their ignorance when it comes to understanding statistics, and sometimes proudly so. In science, however, we have to insist on a solid understanding of statistics to interpret what we see, or think we see, and in order to draw balanced and reliable conclusions, taking the uncertainties, which are always there, properly into account.


Central limit theorem. Often one is interested in a quan- 
tity y, which is dependent on many different independent 
random variables. The height of people for example may be written as the sum of other random variables $x^{(m)}$ with $m = 1,\ldots,M$, where each may have its own distribution $p(x^{(m)})$. Under general conditions on the distributions $p(x^{(m)})$ the distribution $P(y)$ we are interested in will approach the Gaussian or normal distribution. So quantities
that equal the sum of many random variables, which need 
not be normally distributed themselves, tend to be nor- 
mally distributed! This is as true for the velocity distribu- 
tion of particles in a gas kept at a given temperature, as it 
is for the height distributions in a population, or for the fre- 
quency of errors, but also for the minimal uncertainty wave 
packet describing a quantum particle. The importance of 
this normal distribution cannot be overstated as it pops up 
in any serious field of study. This is nicely expressed in 
the following quote of Sir Francis Galton, the Victorian pro- 
gressive, polymath, statistician, sociologist, psychologist, 
anthropologist, eugenicist, tropical explorer, geographer, 
inventor, meteorologist, proto-geneticist, and psychometri- 
cian: 


I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed
by the ‘law of frequency of error’ [the normal or Gaus- 
sian distribution]. Whenever a large sample of chaotic 
elements is taken in hand and marshalled in the order 
of their magnitude, this unexpected and most beautiful 
form of regularity proves to have been latent all along. 
The law. . . reigns with serenity and complete self- 
effacement amidst the wildest confusion. The larger the 
mob and the greater the apparent anarchy, the more per- 
fect is its sway. It is the supreme law of unreason. 
(Galton, 1889) 


The normal distribution depends on two parameters, its mean or expectation $\mu$ and its variance $\sigma^2$, and it is given by the following expression,


$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} .$ (A.16)


Figure A.19: The Gaussian or normal distribution, with variance $\sigma^2 = 1$ and mean $\mu = 0$.


We have depicted the normal distribution in Figure A.19 
with its familiar bell shape. 
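To see the central limit theorem at work, one may compare the exact distribution of the sum of $n$ dice with the Gaussian (A.16) of the same mean and variance. The sketch below is our own illustration. It assumes a mean of $3.5\,n$ and a variance of $35n/12$ (the single-die values times $n$), and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def convolve(p, q):
    """Distribution of the sum of two independent variables."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out


def dice_sum_distribution(n):
    """Exact distribution of the sum of n fair six-sided dice."""
    die = {x: 1.0 / 6.0 for x in range(1, 7)}
    dist = {0: 1.0}
    for _ in range(n):
        dist = convolve(dist, die)
    return dist


def gaussian(x, mu, sigma2):
    """Normal density with mean mu and variance sigma2, as in (A.16)."""
    return exp(-(x - mu) ** 2 / (2.0 * sigma2)) / sqrt(2.0 * pi * sigma2)


def max_deviation(n):
    """Largest difference between the exact and the Gaussian values."""
    mu, sigma2 = 3.5 * n, 35.0 * n / 12.0
    dist = dice_sum_distribution(n)
    return max(abs(p - gaussian(x, mu, sigma2)) for x, p in dist.items())


for n in (1, 2, 4, 8):
    print(n, max_deviation(n))
```

The printed deviations shrink as $n$ grows, although a single die is uniformly distributed and not Gaussian at all.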


Statistical physics. To describe the macroscopic properties of systems like gases, fluids and plasmas one does not need to know the precise properties of all individual particles making up the system. Fortunately so, because that would amount to solving some $10^{23}$ coupled partial differential equations. If we put the particles say in a container,
then each of the particles has a well-defined phase space 
that is the same for all of them, but each particle may sit in 
a different corner of the phase space. Boltzmann made the 
assumption that such a macro-system may then be char- 
acterized by some distribution of the particles over phase 
space. 


For a simple gas or fluid, he introduced the distribution function $f(\mathbf{x},\mathbf{v},t)$, giving the probability density for a particle in the gas to have position $\mathbf{x}$ and velocity $\mathbf{v}$ at time $t$. This function will have some generic features. He in fact showed that this distribution function had to satisfy a fundamental equation which now carries his name. From $f$ one can derive the number density distribution, $n(\mathbf{x},t) = \int f(\mathbf{x},\mathbf{v},t)\, d^3v$.


If the system is in equilibrium, one has that the distribution 
f is time independent. In a gas in equilibrium (without ex- 
ternal forces) we expect the particles to spread out evenly 
over the volume, so f will also be x independent, and be- 
cause of the interactions one expects that the energy will 
be quite equally distributed over the particles. If we keep 
the gas at a fixed temperature, so that the average en- 
ergy per particle equals 3kT/2, this leads to the well-known 
Maxwell-Boltzmann equilibrium velocity distribution: 


$f(\mathbf{v}) = n \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mv^2/2kT} ,$ (A.17)


which is a 3-dimensional Gaussian distribution.
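A quick numerical sanity check of the statement that the average energy per particle equals $3kT/2$: since the equilibrium distribution is Gaussian in each velocity component with variance $kT/m$, we can draw velocities from it and average the kinetic energy. This is only a sketch under our own assumptions: units with $m = kT = 1$ by default, a fixed random seed, and a hypothetical function name.

```python
import random
from math import sqrt


def mean_kinetic_energy(n_samples, m=1.0, kT=1.0, seed=0):
    """Monte Carlo estimate of <m v^2 / 2> for Maxwell-Boltzmann velocities."""
    rng = random.Random(seed)
    sigma = sqrt(kT / m)  # standard deviation of each velocity component
    total = 0.0
    for _ in range(n_samples):
        vx, vy, vz = (rng.gauss(0.0, sigma) for _ in range(3))
        total += 0.5 * m * (vx * vx + vy * vy + vz * vz)
    return total / n_samples


print(mean_kinetic_energy(100_000))  # close to 1.5, i.e. 3kT/2 with kT = 1
```

Each of the three velocity components contributes $kT/2$ on average, which is where the factor 3 comes from.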


Entropy. With a given distribution $p$, one can always associate a certain Gibbs-Shannon or information entropy $S(p)$, with


$S(p) = -\sum_i p_i \log_2 p_i .$ (A.18)


The entropy is thus a number that you can calculate given a distribution. If the outcome is certain, then one has for one particular $i$ that $p_i = 1$ while the others are zero, and one finds that $S = 0$. On the other hand, if the outcome is maximally uncertain we will have for $N$ states that $p_i = 1/N$ for all $i$, implying that the entropy will attain its maximal value $S = \log_2 N$. Another interesting property is that entropy is an additive quantity if one combines two independent distributions. Imagine simultaneously throwing a fair coin and a fair die with distributions $p^{(1)}$ and $p^{(2)}$; then there are $2\times 6 = 12$ states with a combined distribution $p = p^{(1)}\times p^{(2)}$. The entropies then satisfy the additive relation $S = S^{(1)} + S^{(2)}$. In other words, if one finds in an experiment that the additive property does not hold, this indicates some interdependence between the variables, which in physical terms means that the two components of the system interact. It is therefore certainly possible to have a closed system consisting of two interacting subsystems, where the entropy of one subsystem actually decreases, as long as the entropy of the other subsystem increases by an equal or larger amount, so as to make sure that the whole system satisfies the second law. For example, if one has a mixture of different particle types, which at some point will start binding, the bound state represents a lower energy state, and thus in this transition heat will be released, which corresponds to pure entropy production. Here we see that on the one hand the interactions cause more structure, a higher level of order and thus less entropy in the particle component of the system, but at the same time the entropy of the system as a whole will increase because of the amount of heat that is produced.
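The additivity of the entropy (A.18) for the coin and the die can be verified directly. The sketch below is our own illustration. The `correlated` distribution at the end is a hypothetical example of two coins that always show the same face, added to show how additivity fails once the variables are no longer independent.

```python
from math import log2


def entropy(probs):
    """Gibbs-Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


coin = [0.5, 0.5]
die = [1.0 / 6.0] * 6
# Independent variables: the joint probabilities are simple products.
joint = [pc * pd for pc in coin for pd in die]

print(entropy(coin), entropy(die), entropy(joint))

# Two perfectly correlated coins: only (heads, heads) and (tails, tails) occur.
correlated = [0.5, 0.0, 0.0, 0.5]
print(entropy(correlated), 2 * entropy(coin))
```

For the independent pair the joint entropy equals $\log_2 12 = 1 + \log_2 6$, while for the correlated coins the joint entropy (1 bit) is smaller than the sum of the individual entropies (2 bits).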


Maximal entropy principle. If you have a certain sample space, you may want to consider different distributions $p^{(m)}$ over that space and compare their entropies. Then an interesting fact is that the distribution that maximizes the entropy over the set of distributions $\{p^{(m)}\}$ is the best guess you can make, assuming that you know nothing else about the process or the distribution you are studying except that the probabilities add up to one. But in many cases you do know more, for example you know the average outcome of some observable $A(x)$, so $\langle A(x)\rangle = A_0$. Then you want to maximize the entropy under the additional constraint that $\langle A(x)\rangle = \sum_i p_i A(x_i) = A_0$, and that will lead to another maximal entropy distribution. So the maximal entropy distribution is the least biased probability distribution under the given set of constraints. Many of the distributions that play an important role in nature are maximal entropy distributions. Let us look at some of the familiar cases:


(i) We define the information entropy $H(\{p_i\};\{\lambda_k\})$ as the entropy but with the constraints added, each multiplied by a parameter (Lagrange multiplier) $\lambda_k$. The trivial case is where we impose that the sum of the probabilities equals one:

$H(\{p_i\},\lambda_0) = -\sum_i p_i \ln p_i - \lambda_0\Big(\sum_i p_i - 1\Big) .$ (A.19)

We maximize $H$ with respect to the $\{p_i\}$ and $\{\lambda_k\}$ by requiring the partial derivatives to be zero:

$\frac{\partial H}{\partial p_i} = -\ln p_i - 1 - \lambda_0 = 0 ,$ (A.20)
$-\frac{\partial H}{\partial \lambda_0} = \sum_i p_i - 1 = 0 .$ (A.21)

The first equation yields that $p_i$ is constant, $p_i = p$; substitution in the second equation yields $Np - 1 = 0$, so that $p = 1/N$, corresponding to the well-known case of fixed energy or the micro-canonical ensemble.


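(ii) Let us now take a continuous energy type distribution where we know the average energy to be $\epsilon^*$. Then we have to add to the expression (A.19) the constraint term $-\lambda_1\big(\int_0^\infty p(\epsilon)\,\epsilon\, d\epsilon - \epsilon^*\big)$, yielding for the first equation:
$-\ln p - 1 - \lambda_0 - \lambda_1\epsilon = 0 ,$ (A.22)
with solution
$p(\epsilon) = C e^{-\lambda_1\epsilon} .$
From the first constraint we get:
$\int_0^\infty p(\epsilon)\, d\epsilon = C\Big[-\frac{1}{\lambda_1}\, e^{-\lambda_1\epsilon}\Big]_0^\infty = \frac{C}{\lambda_1} = 1 ,$ (A.23)


so we learn that $C = \lambda_1$. Substitution in the second constraint yields another relation that we can solve for both parameters:


$C\int_0^\infty \epsilon\, e^{-\lambda_1\epsilon}\, d\epsilon = \epsilon^* .$ (A.24)


Let us rewrite the integral as a derivative with respect to $\lambda_1$:
$C\int_0^\infty \epsilon\, e^{-\lambda_1\epsilon}\, d\epsilon = -C\,\frac{d}{d\lambda_1}\int_0^\infty e^{-\lambda_1\epsilon}\, d\epsilon = \frac{C}{\lambda_1^2} = \epsilon^* ,$ (A.25)
which yields $C = \lambda_1 = 1/\epsilon^*$ and we obtain the simple exponential distribution:


$p(\epsilon) = \frac{1}{\epsilon^*}\, e^{-\epsilon/\epsilon^*} .$ (A.26)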


(iii) A similar calculation can be set up for the case where 
we have prior knowledge about the variance of the distri- 
bution, in which case one obtains a Gaussian distribution, 
like the celebrated Maxwell-Boltzmann distribution. 
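The same reasoning can be tried out on a discrete example. The sketch below is not from the text: it takes the six faces of a die as sample space, imposes a hypothetical average outcome of 4.5, and finds the exponential (Boltzmann-like) distribution $p_i \propto e^{-\lambda x_i}$ that satisfies the constraint by a simple bisection on $\lambda$. The comparison distribution `alt` is an arbitrary choice of ours with the same mean.

```python
from math import exp, log


def boltzmann(values, lam):
    """Normalized distribution p_i proportional to exp(-lam * x_i)."""
    weights = [exp(-lam * x) for x in values]
    z = sum(weights)
    return [w / z for w in weights]


def mean(values, probs):
    return sum(x * p for x, p in zip(values, probs))


def entropy(probs):
    """Entropy in natural units (nats)."""
    return -sum(p * log(p) for p in probs if p > 0)


def solve_lambda(values, target, lo=-10.0, hi=10.0, steps=200):
    """Bisection on lam; the mean decreases monotonically as lam grows."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if mean(values, boltzmann(values, mid)) > target:
            lo = mid  # mean still too large: lam must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


values = [1, 2, 3, 4, 5, 6]
lam = solve_lambda(values, 4.5)
p = boltzmann(values, lam)

# Another distribution with the same mean 4.5, chosen by hand.
alt = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]

print(lam, p)
print(entropy(p), entropy(alt))
```

Both distributions have the same average outcome, but the exponential one has the larger entropy, in line with it being the least biased choice under the given constraint.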


The maximal entropy principle is a powerful tool for constructing the optimal distribution satisfying a certain number of constraints. And we see that this is completely consistent with our discussion of statistical mechanics in Chapter I.1. A virtue of the maximal entropy principle is that it
nicely separates the purely statistical and the more physi- 
cal aspects in the approach to macroscopic systems. This 
approach to statistical mechanics, inspired by the work of 
Gibbs and Shannon, was introduced in 1957 by the Amer- 
ican physicist Edwin Thompson Jaynes. 


Quantum entropy. In quantum theory, probability plays 
an important role even if we consider a system consisting 
of a single particle, as its wave function or state vector is 
a probability amplitude that encodes the probability for obtaining certain outcomes of measurements of an observable. Therefore probability is built in right from the start for any quantum system, and you expect that there is some meaning to the notion of entropy as well. Indeed there is: the quantum entropy was defined by Von Neumann much in parallel with its classical precursor:


$S = -\mathrm{Tr}\,\rho\log\rho .$ (A.27)


In this expression, $\rho$ is the so-called density matrix of the system as discussed in Chapter II.1, which represents the state of the system. The symbol Tr stands for the trace of a matrix, which equals the sum of its diagonal components. The Von Neumann entropy is a measure of the degree of entanglement of a multicomponent quantum system.
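For a single qubit the trace in (A.27) is easy to evaluate by hand, because a $2\times 2$ density matrix has only two eigenvalues $p_\pm$ and $S = -\sum p_\pm \log p_\pm$. The following sketch is our own illustration. It assumes a hermitian matrix with unit trace given as nested lists, and it uses the logarithm with base 2, which is a choice on our part since (A.27) leaves the base unspecified.

```python
from math import log2, sqrt


def von_neumann_entropy_2x2(rho):
    """Entropy -Tr(rho log2 rho) of a 2x2 hermitian density matrix."""
    (a, b), (c, d) = rho
    trace = (a + d).real
    det = (a * d - b * c).real
    # Eigenvalues of a 2x2 matrix from its trace and determinant.
    disc = sqrt(max(0.0, trace * trace - 4.0 * det))
    eigenvalues = [(trace + disc) / 2.0, (trace - disc) / 2.0]
    return -sum(p * log2(p) for p in eigenvalues if p > 1e-15)


pure = [[0.5, 0.5], [0.5, 0.5]]    # a pure state: one eigenvalue equals 1
mixed = [[0.5, 0.0], [0.0, 0.5]]   # the maximally mixed state

print(von_neumann_entropy_2x2(pure))   # vanishes for a pure state
print(von_neumann_entropy_2x2(mixed))  # 1 bit for the maximally mixed state
```

The maximally mixed matrix is what one obtains for one qubit of a maximally entangled pair after discarding the other qubit, which is the sense in which this entropy measures entanglement.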


On complex numbers


Mathematics is one of the few places where com- 
plexification often stands for simplification. 


Number systems. It is interesting to note how number 
systems have been extended through history. A natural 
starting point are the natural numbers or positive integers, 
and we know how to add and subtract them, where to 
stay within the set of natural numbers the subtraction is re- 
stricted to numbers smaller (or equal if we include zero in 
the set). We can extend the definition of subtraction to all 
natural numbers but that forces us to augment the set with 
the exquisite number ‘zero’ and the negative integers. One 
defines multiplication as an operation on the integers and 
then we see that the inverse operation called division is 
restricted and forces us to introduce the rational numbers 
or fractions. The next step is taking powers, and defining 
their inverse as taking the corresponding roots. Applied 
to positive numbers this leads to the real numbers, with the remark that of course all rational numbers are real but not the other way around, as for example the real number $\sqrt{2}$ shows. If we extend the definition of roots to negative numbers we are led to the introduction of the complex numbers, where indeed the fundamental new element is the imaginary unit $i = \sqrt{-1}$.


Definition of a complex number. A complex number $\alpha$ has a real and an imaginary part, $\alpha = a_1 + i a_2$, where $a_1$ and $a_2$ are both real, and $i$ is the imaginary unit with the defining property $i^2 = -1$. Note that a complex number can therefore also be thought of as a vector in a two-dimensional real space, also called the complex plane, by taking the real part as the x-component and the imaginary part as the y-component, and thus writing $z = x + iy$. The length of the vector is called the magnitude or absolute value of $\alpha$ and denoted by $|\alpha|$, and the angle it makes with the real (x) axis is called its argument or phase.

The complex conjugate of $\alpha$ is defined as $\alpha^* = a_1 - i a_2$,
(a) Polar representation of a complex number $\alpha = \rho\exp(i\varphi)$.
(b) Adding and subtracting two complex numbers $\alpha$ and $\beta$ by the ‘parallelogram’ rule.
(c) Multiplying two complex numbers $\alpha$ and $\beta$ amounts to multiplying their magnitudes ($\rho_{\alpha\beta} = \rho_\alpha\rho_\beta$) and adding their phase angles ($\varphi_{\alpha\beta} = \varphi_\alpha + \varphi_\beta$).
(d) The square and square root of a complex number $\alpha$. Here the blue angle is half and the purple angle is twice the red angle.


Figure A.20: Complex numbers. Graphical representation of some basic operations with complex numbers.
it is obtained by replacing $i$ by $-i$. The value of $|\alpha|$ is defined by the relation $|\alpha|^2 = \alpha^*\alpha = a_1^2 + a_2^2$, where one obtains the result by multiplying out the expressions and remembering that $-i^2 = +1$, so $(a_1 + ia_2)(a_1 - ia_2) = a_1^2 - i^2a_2^2 = a_1^2 + a_2^2$. This indeed equals the squared length of the corresponding vector.


Polar decomposition. There is an alternative but equivalent way to think of complex numbers, explicitly using their two-dimensional vector property. If one thinks of a planar vector in polar coordinates, one may specify it by giving its magnitude $\rho$ and the angle $\varphi$ it makes with the x-axis. The complex number is then written as $\alpha = \rho e^{i\varphi}$: the terminology is that $\varphi$ is called the argument or phase angle, and $e^{i\varphi}$ the phase factor. We see that $|\alpha| = \rho$ and $|e^{i\varphi}| = 1$. The phase factor therefore describes a point on the unit circle in the complex plane which makes an angle $\varphi$ with the real axis. This is depicted in Figure A.20(a), from which one also sees that the real part of the phase factor equals $\cos\varphi$, while the imaginary component equals $\sin\varphi$, which leads to a famous mathematical identity originally due to Euler:


$e^{i\varphi} = \cos\varphi + i\sin\varphi .$ (A.28)


This formula is a source of numerous amusing number theoretical identities like $e^{i\pi} + 1 = 0$ and $e^{i\pi/2} = i$. In this parametrization of complex numbers it is easy to perform complex multiplication and division and to take powers or roots.


Algebraic properties of complex numbers. To add or subtract two complex numbers, one just adds or subtracts their real and imaginary parts separately: $\alpha + \beta = (a_1 + b_1) + i(a_2 + b_2)$. This corresponds to adding (subtracting) two vectors in the plane by the ‘parallelogram’ rule as indicated in Figure A.20(b). Multiplying two complex numbers $\alpha_1$ and $\alpha_2$ amounts to multiplying the magnitudes, i.e. $\rho = \rho_1\rho_2$, while the phase angles add, $\varphi = \varphi_1 + \varphi_2$, as in Figure A.20(c). Similarly, when dividing two complex numbers one divides the magnitudes and takes the difference of the phase angles. Taking a complex conjugate amounts to replacing $\varphi$ by $-\varphi$, i.e. mirroring the vector in the x-axis. We see that the polar representation of complex numbers makes it particularly easy to visualize the multiplication and division operations, but also to take their powers and roots, as we did in Figure A.20(d).
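Most programming languages have complex numbers built in, which makes it easy to play with the rules above. The sketch below uses Python's built-in `complex` type and the standard `cmath` module. The particular numbers $\alpha = 3 + 4i$ and $\beta = 1 + i$ are arbitrary choices of ours.

```python
import cmath
from math import pi

alpha = 3 + 4j
beta = 1 + 1j

print(alpha.conjugate())          # (3-4j): replace i by -i
print(abs(alpha))                 # 5.0: the magnitude |alpha|
print(alpha * alpha.conjugate())  # (25+0j): |alpha|^2 = a1^2 + a2^2

# Polar decomposition alpha = rho * exp(i * phi).
rho_a, phi_a = cmath.polar(alpha)
rho_b, phi_b = cmath.polar(beta)
rho_ab, phi_ab = cmath.polar(alpha * beta)

# Magnitudes multiply and phase angles add.
print(rho_ab, rho_a * rho_b)
print(phi_ab, phi_a + phi_b)

# Euler's identity: both numbers vanish up to rounding errors.
print(cmath.exp(1j * pi) + 1)
print(cmath.exp(1j * pi / 2) - 1j)
```

In general the phase angles add only up to a multiple of $2\pi$, because `cmath.polar` returns an angle between $-\pi$ and $\pi$. For the two numbers chosen here the sum stays within that range.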


On complex vectors and matrices


We have discussed real vectors and matrices in the Math 
Excursion on page 614. But in quantum theory everything 
gets complexified, meaning to say that states are repre- 
sented by complex vectors and observables by complex 
(hermitian) matrices. Therefore we will summarize here 
some additional material specific to complex vectors and 
matrices. 


Complex vectors. Think of our vectors as column or ket vectors $|v\rangle$ which are complex, which means that the entries or components are complex numbers. Then we may define a space of dual vectors: the dual of a column vector is a row or bra vector $\langle v|$, with complex conjugate entries.


The inner- or dot-product. Having a vector space $V$ and its dual $V^*$, the inner product between elements $v^* \in V^*$ and $w \in V$ is defined as the number obtained after adding the products of corresponding entries:


$\langle v|w\rangle = v^*\cdot w = \sum_i v_i^* w_i .$


We calculate for example the dot product of two two-dimensional complex vectors as:


$\begin{pmatrix} 2i & 1 \end{pmatrix}\begin{pmatrix} i \\ 1 \end{pmatrix} = 2i^2 + 1 = -1 .$


The property of the inner product that


$\langle w|v\rangle = \langle v|w\rangle^* ,$

still implies that $\langle v|v\rangle = \langle v|v\rangle^* = |v|^2$ is always a positive real number, which is defined to be the length of the vector $|v|$ squared.


The state space of a qubit. The state of a qubit is by definition the two-dimensional complex vector $|\psi\rangle$ of equation (II.1.2). The normalization condition applied to the state can be written as:


$\langle\psi|\psi\rangle = |\psi|^2 = |\alpha|^2 + |\beta|^2 = 1 .$ (A.29)


If we substitute $\alpha = a_1 + ia_2$ and $\beta = b_1 + ib_2$, then we find


$a_1^2 + a_2^2 + b_1^2 + b_2^2 = 1 .$ (A.30)


This equation describes a (real) three-dimensional sphere, $S^3$, embedded in the four-dimensional Euclidean space, $\mathbb{R}^4$, with coordinates $(a_1, a_2, b_1, b_2)$.
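The inner product and the normalization condition are readily checked on a computer. The sketch below represents kets as plain Python lists of complex numbers. The helper `inner` and the example vectors are our own choices and not taken from the text.

```python
def inner(v, w):
    """Inner product <v|w> = sum_i conj(v_i) * w_i."""
    return sum(vi.conjugate() * wi for vi, wi in zip(v, w))


v = [1 + 1j, 2j]
w = [3 + 0j, 1 - 1j]

print(inner(v, w))   # (1-5j)
print(inner(w, v))   # (1+5j): the complex conjugate of <v|w>
print(inner(v, v))   # (6+0j): a real number, the length squared

# A normalized qubit state with alpha = 0.6 and beta = 0.8i.
psi = [0.6 + 0j, 0.8j]
a1, a2 = psi[0].real, psi[0].imag
b1, b2 = psi[1].real, psi[1].imag
print(inner(psi, psi).real, a1**2 + a2**2 + b1**2 + b2**2)
```

Both expressions on the last line agree with each other and equal one up to rounding, so the four real numbers $(a_1, a_2, b_1, b_2)$ indeed define a point on the three-sphere.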


Complex matrices acting on complex vectors. Now 
vectors can also be multiplied by matrices to produce another vector; the way that is done was pictorially indicated for a column vector in Figure A.8(b). This action of matrices on
vectors is clearly most easily understood if you think of 
the matrix as a stack of row vectors. This action can also 
be considered as a transformation of a vector into another 
vector. A simple example may help: 


(Sy Ga ea a) 


The matrix acts as a linear operator on the vector space, 
as it reshuffles the components into linear combinations of 
them. Or one may say that $(n\times n)$ matrices map the vector space $V$ onto itself and we write $A : V \to V$. There is for example a particular subset of $(3\times 3)$ matrices whose action on ‘ordinary’ vectors corresponds to rotating those vectors in three-dimensional complex space $\mathbb{C}^3$.


Another example which shows the descriptive power of 
matrices as operators on state vectors is in (quantum) com- 
putation, where generically we think of computation as a 


sequence of gates, interactions/manipulations or measure- 
ments that change the states of a set of (qu)bits. 


Such processes or computations can be represented by a 
product of matrices and rescalings. Indeed the complete 
computation is just a big operator, mapping the in-state on 
the out-state vector. 


The matrix product. Once we have defined the action of 
matrices on vectors the step to the multiplication of matri- 
ces is straightforward and it was visualized in Figure A.8(c). 
The $(ij)$-entry of the product matrix $C = AB$ is obtained by the dot product of the $i$-th row vector of $A$ with the $j$-th column vector of $B$. Let us again give a simple example:


Types of matrices. As mentioned before, depending on 
the situation we usually have to put additional constraints 
defining subsets of matrices, which may or may not be 
preserved under the basic matrix operations. These definitions involve certain basic matrix manipulations which were represented symbolically in Figure A.10. A fundamental notion is the transpose of a matrix $A$, denoted by $A^{tr}$, which is obtained from $A$ as we illustrated in Figures A.10(a) and A.10(b). Written in terms of its entries one has $(A^{tr})_{ij} = A_{ji}$. Taking the transpose can therefore also be defined as interchanging rows and columns. Repeating the operation brings you back to the original matrix. Taking the transpose of the matrix $C = AB$ we get a matrix which is the product of the transposes, but in the opposite order: $C^{tr} = B^{tr}A^{tr}$. Symmetric or antisymmetric matrices satisfy $A = \pm A^{tr}$ respectively. Note that a symmetric complex matrix contains $n(n+1)$ real numbers, while the antisymmetric one has only $n(n-1)$; this adds up to $2n^2$, the number of real entries of a general complex $(n\times n)$ matrix.
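The row-times-column rule for the matrix product and the reversal of order under transposition can be checked with a few lines of code. In the sketch below matrices are nested Python lists. The helper names and the two example matrices are our own and purely illustrative.

```python
def matmul(a, b):
    """(ij)-entry = dot product of the i-th row of a with the j-th column of b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Interchange rows and columns."""
    return [list(row) for row in zip(*a)]


A = [[1, 2j], [3, 4]]
B = [[0, 1], [1j, -1]]

C = matmul(A, B)
print(C)
print(transpose(C))
print(matmul(transpose(B), transpose(A)))  # equal to the transpose of C
print(matmul(transpose(A), transpose(B)))  # a different matrix
```

Only the product of the transposes in the opposite order reproduces the transpose of $C$, and transposing twice gives back the original matrix.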
Hermitian matrices. Of special importance in quantum theory are the hermitian matrices, because they represent observable physical quantities. To tell you what they look like we first define the hermitian adjoint $A^\dagger$ as $A^\dagger = (A^{tr})^*$ (see Figure A.10(d)). A hermitian (self-adjoint) matrix is just one that satisfies $A = A^\dagger$. It is not hard to see that a hermitian matrix can be decomposed in the sum of a symmetric real and an antisymmetric purely imaginary matrix, also implying that the diagonal elements are real. Such a hermitian matrix contains $n^2$ real numbers. Let us give a simple example of the above operations for a $2\times 2$ matrix:
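$C = \begin{pmatrix} 1 & i \\ 1 & -1 \end{pmatrix}, \quad C^{tr} = \begin{pmatrix} 1 & 1 \\ i & -1 \end{pmatrix}, \quad C^\dagger = \begin{pmatrix} 1 & 1 \\ -i & -1 \end{pmatrix};$


we see that $C$ is not hermitian because $C \neq C^\dagger$. Each of the Pauli matrices on the left-hand side of equation (A.32) however is hermitian. Note however that their product is not.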


The Pauli matrices. Most famous is the set of three $(2\times 2)$ hermitian matrices, which are called the Pauli matrices $X$, $Y$ and $Z$. They are defined as:


$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$


and have a quite unique combination of properties.

(i) They are hermitian: $X^\dagger = X$ etc.

(ii) They are unitary: $X^\dagger X = 1$ etc.

(iii) From (i) and (ii) it follows that they square to the unit matrix: $X^2 = 1$ etc.

(iv) They form a basis of the su(2) Lie algebra, which means that they form a closed algebra under commutation: $[X,Y] = 2iZ$ etc. (see below).

(v) The anti-commutator of two different Pauli matrices vanishes: $\{X,Y\} = XY + YX = 0$ etc.

(vi) The one qubit observables are linear combinations of the Pauli matrices; the spin-half operators correspond to $S_x = \hbar X/2$ etc.

(vii) If we add the unit matrix (which commutes with all three of the Pauli matrices, and which is also hermitian), we get the algebra of $u(2) \simeq su(2) \oplus u(1)$.

(viii) Every $2\times 2$ unitary matrix can be written as a linear combination of these four matrices (see below).
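Properties (i) to (v) can be confirmed by direct multiplication. The sketch below does this with nested Python lists. All helper names (`matmul`, `dagger`, `commutator`, and so on) are our own and the code is meant as an illustration only.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian adjoint: transpose and complex conjugate."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


for P in (X, Y, Z):
    print(dagger(P) == P,                 # (i) hermitian
          matmul(dagger(P), P) == I2,     # (ii) unitary
          matmul(P, P) == I2)             # (iii) squares to the unit matrix

print(commutator(X, Y) == scale(2j, Z))   # (iv) [X, Y] = 2iZ
print(anticommutator(X, Y) == scale(0, I2))  # (v) {X, Y} = 0
```

Every comparison prints `True`. The cyclic versions $[Y,Z] = 2iX$ and $[Z,X] = 2iY$ can be checked in the same way.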


Lie algebras.


Hermiticity is not a property that is preserved under matrix multiplication: if you multiply two hermitian matrices, their product is in general not hermitian. However, their antisymmetric product or commutator, multiplied by $i$, is hermitian. So if $A$ and $B$ are hermitian, then:


$(i[A,B])^\dagger = -i(AB - BA)^\dagger = -i(B^\dagger A^\dagger - A^\dagger B^\dagger) = -i(BA - AB) = i[A,B] .$


In this sense the commutator of observables yields another observable, or to put it another way: the observables form a closed commutator algebra, where the ‘product’ operation of the algebra is then defined as the commutator: $A\cdot B = i[A,B]$. We see a splendid example of this with the qubit, where we had three basic observables $\{X,Y,Z\}$ that form a closed algebra under commutation:

$[X,Y] = 2iZ, \quad [Y,Z] = 2iX, \quad [Z,X] = 2iY ;$ (A.33)
this three-dimensional algebra is called su(2). The beauty of the subject becomes clear if you think, for example, of the su(2) algebra not as a set of relations that our spin matrices satisfy, but as an abstract set of commutators that define the algebra. In general one should think of a set of elements $X_i$ that form the basis of the Lie algebra $A$, satisfying commutation relations:


$[X_i, X_j] = \sum_k f_{ijk} X_k ;$


the specific set of constants $\{f_{ijk}\}$ are the so-called structure constants which define the Lie algebra.
Now you can turn the question around and ask, given the structure constants, whether there exist any sets of matrices or other operators that actually do satisfy precisely the above relations. This is what one calls the representation theory of Lie algebras, an important part of the mathematical theory. In physics we encounter this all the time; for example, the su(2) algebra is basically the algebra of rotations in three-dimensional space.² It is the algebra satisfied by the angular momentum operators $\{L_x, L_y, L_z\}$ as differential operators, but the algebra also has irreducible representations as $(n\times n)$ matrices for any $n = 1,2,3,\ldots$. If we write $n = 2s+1$ then $s$ is now defined as the spin, or the angular momentum, and we see that indeed all half-integral and integral values are possible. The integer values we see recurring as the quantum number $l$ in the spectra of atoms. The $s = 1/2$ case clearly corresponds to the $2\times 2$ matrices $S_i$. The complex Lie algebras and their ‘irreducible’ representations have been classified completely and form an important subject in the mathematics and physics literature.


² This algebra is defined by the commutation relations of equation (A.33) without the factor 2 on the right. In other words $S_x = X/2$ etc.


On symmetry groups


Symmetries are a powerful guiding principle in identifying 
and understanding important properties of physical sys- 
tems. The notion of symmetry can be applied to objects, 
to spaces or lattices, to equations, to the degeneracies 
in the spectra of atoms and molecules, but also of the 
electron bands of materials where the ions form an un- 
derlying lattice structure. Here we limit ourselves to the 
basic mathematical background concerning the symme- 
try groups, which we will refer to throughout the book. In 
Chapter II.6 we have an extensive section devoted to the 
physical aspects of symmetries and their breaking. 


Groups: the language of symmetry. When we talk about 
order we usually refer to some regularities, some predictab- 
le pattern that has some or many symmetries. The word 
symmetry in physics has many different meanings and is 
like the word ‘snow’ for the Inuits. One speaks of finite or 
infinite, discrete or continuous symmetries. Symmetries 
of objects, of spaces, and of equations. And on another 
level one speaks of global or local, exact or approximate 
symmetries. We encountered already the notion of frame 
rotations, of space-time rotations, and of gauge transfor- 
mations. And the elaborated structure of fiber bundles as 
described in chapter I.1, involved the concept of a local or 
gauge symmetry. 


The notions just mentioned are relevant in different con- 
texts but they share the underlying mathematical concept 
of a group. Let us introduce this concept in its elementary 
easy to grasp form as a group of transformations. One can 
indeed think of transforming an object as applying some 
operation on it, like rotating it, or moving (translating) it in 
some direction, or mirroring it (like transforming your left shoe into your right shoe), or scaling the object by changing
its size but not its shape. Generally we think of the group 
as acting on some vector space, where the objects, like 
fields or states, are defined as vectors. 
Defining properties of a group. Mathematically a group
is just a set of elements and a ‘product rule’ that satisfy 
some rather obvious axioms, and interestingly those ax- 
ioms are so restrictive that basically everything is known 
about the groups that play a role in physics. Group theory 
is a rich branch of mathematics and we will only scratch 
the surface here. 


We denote the group by $G$: it is a set of elements (i.e. transformations or operations) $g_i$, and we write $G = \{g_i\}$ and conversely $g_i \in G$. There are four defining properties:


(i) composition rule: if $g_1\cdot g_2 = g_3$ with $g_1, g_2 \in G$ then $g_3 \in G$; this composition rule is often referred to as the group multiplication.


(ii) associativity: the group multiplication is associative, which means that the outcome of a product does not depend on the way we group the factors when performing the multiplication, so,


$(g_1\cdot g_2)\cdot g_3 = g_1\cdot(g_2\cdot g_3) = g_1\cdot g_2\cdot g_3 .$


(iii) identity: there always is the trivial transformation of doing nothing; it corresponds to the identity element $e$, which satisfies

$e\cdot g = g\cdot e = g$ for all $g$.


(iv) inverse: you can always transform back, meaning that each element $g$ has a unique inverse $g^{-1}$ with


$g\cdot g^{-1} = g^{-1}\cdot g = e .$


Numbers or matrices certainly can form groups, but note
that we only refer to a single ‘composition rule’ or ‘product’
of elements. A group by itself need not be a linear space or an
algebra. A set of objects that is closed under some kind
of product is maybe the easiest way to think about it.
In that sense a group is an elementary and natural notion,
and you may be more familiar with it than you think.


Figure A.21: The dihedral group $D_3$. The symmetry group of
an equilateral triangle is the group $D_3$ consisting of 6 elements;
it is the same as the permutation group of 3 objects.
There is one threefold axis, and three twofold axes.


Some examples. The set of all integers $n$ forms a group
$G = \mathbb{Z}$, where the composition rule is addition, the identity
element is $n = 0$ and the ‘inverse’ of $n$ is $-n$. This is
an infinite discrete group. Note that the integers do not
form a group under multiplication, because of the problem
caused by the inverse operation: zero has no inverse, while
dividing two integers generally takes you outside the integers
into the set of fractional numbers.
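The four axioms can be checked by brute force on any finite set. The following is a minimal sketch, not part of the original text: the helper name `is_group` is hypothetical, and the integers modulo 6 are used as a finite stand-in for $\mathbb{Z}$, since an infinite set cannot be enumerated. Addition modulo 6 passes all four tests, while multiplication modulo 6 fails because zero has no inverse.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the four group axioms on a finite set."""
    elements = list(elements)
    # (i) composition rule: the product of two elements stays in the set
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # (ii) associativity: the bracketing does not matter
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # (iii) identity: some e with e.g = g.e = g for all g
    identities = [e for e in elements
                  if all(op(e, g) == g and op(g, e) == g for g in elements)]
    if not identities:
        return False
    e = identities[0]
    # (iv) inverse: every g has some h with g.h = h.g = e
    return all(any(op(g, h) == e and op(h, g) == e for h in elements)
               for g in elements)

def add_mod6(a, b):
    return (a + b) % 6

def mul_mod6(a, b):
    return (a * b) % 6

print(is_group(range(6), add_mod6))  # True
print(is_group(range(6), mul_mod6))  # False: 0 has no inverse
```

The same checker can be pointed at any small set with a product rule, which is a convenient way to test one's intuition about what is and is not a group.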


The real numbers, which correspond to an infinite line, form
a continuous group of translations $T = \mathbb{R}$, again under addition
(subtraction). Yet another example is provided by rotations in
the plane. We may rotate a two-dimensional object by a
certain angle $\varphi$ where $0 \le \varphi < 360°$. Now the group is
not a line but a circle: rotating by 360° is like doing nothing.
This two-dimensional rotation group, denoted by SO(2), is
the same as the ‘phase group’, denoted by U(1).
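The statement that SO(2) and U(1) are the same group can be made concrete with a small numerical check. This is a sketch under our own conventions, not the author's: the function names are hypothetical, and a rotation matrix with entries $\cos\varphi$ and $\sin\varphi$ is mapped to the phase $\cos\varphi + i\sin\varphi$.

```python
import cmath
import math

def rotation(phi):
    """2x2 rotation matrix (an element of SO(2)) for an angle phi in radians."""
    c, s = math.cos(phi), math.sin(phi)
    return ((c, -s), (s, c))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def phase(phi):
    """The phase factor e^{i phi} (an element of U(1))."""
    return cmath.exp(1j * phi)

def as_phase(m):
    """Map the rotation matrix ((c, -s), (s, c)) to the complex number c + i s."""
    return complex(m[0][0], m[1][0])

def close(x, y, tol=1e-12):
    return abs(x - y) < tol

phi1, phi2 = math.radians(50), math.radians(100)

# Composing two rotations adds the angles, just as multiplying two phases does.
composed = matmul(rotation(phi1), rotation(phi2))
print(close(as_phase(composed), phase(phi1 + phi2)))         # True
print(close(phase(phi1) * phase(phi2), phase(phi1 + phi2)))  # True

# Rotating by 360 degrees is the same as doing nothing.
print(close(as_phase(rotation(2 * math.pi)), 1))             # True
```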


Let us now discuss the group of transformations that leaves
some object (or space, or equation) invariant, in which
case we speak of the invariance or symmetry group of that
object. Consider an equilateral triangle like in Figure A.21;
it is easy to list the transformations that leave it invariant:
(i) rotations over multiples of 120° about its center, $\{r, r^2\}$,
(ii) mirroring it through the bisector of one of the angles, $\{s_1, s_2, s_3\}$.
This group $G = \{e, r, r^2, s_1, s_2, s_3\}$ has 6 elements and is
denoted as the dihedral group $D_3$. This group is the same
as the permutation group $S_3$ of three objects. The group
$D_3$ readily generalizes for regular polygons (square, pentagon,
hexagon, ...) to groups $D_n$.
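To see that $D_3$ and $S_3$ really coincide, one can represent each symmetry of the triangle by the way it permutes the three corners. The sketch below is only an illustration under assumed labels: the corners are numbered 0, 1, 2, and the tuples chosen for $r$ and $s_1$ are one possible convention. Starting from the rotation $r$ and a single mirror $s_1$, repeated multiplication produces all six elements.

```python
from itertools import permutations

def compose(p, q):
    """The permutation 'first q, then p', written as p . q."""
    return tuple(p[q[i]] for i in range(len(q)))

e = (0, 1, 2)    # doing nothing
r = (1, 2, 0)    # rotation over 120 degrees (assumed corner labelling)
s1 = (0, 2, 1)   # mirror through the bisector at corner 0

def generate(generators):
    """All elements obtained by repeated multiplication of the generators."""
    group = {e}
    frontier = [e]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(h, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

D3 = generate([r, s1])
S3 = set(permutations(range(3)))

print(len(D3))                               # 6
print(D3 == S3)                              # True
print(compose(r, compose(r, r)) == e)        # True: three rotations give the identity
print(compose(r, s1) == compose(s1, r))      # False: the group is non-abelian
```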


Another important class of groups are groups that leave
the inner product of some vector space invariant. For ordinary
three-dimensional vectors, the inner product is
$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\varphi$ and the invariance group is the rotation
group SO(3). For relativistic four-vectors we defined the
inner product as $a \cdot b = a_\mu b^\mu = \eta_{\mu\nu} a^\mu b^\nu$, with
$\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, and it is invariant under the Lorentz
group SO(1,3). In the $n$-dimensional complex Hilbert space
we have state vectors and the hermitian inner product $\langle \Phi | \Psi \rangle$,
and as we discussed in this chapter the invariance group
is the unitary group U(n). We will have more to say about
the unitary groups at the end of this Math Excursion.
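The invariance of the four-vector inner product can be verified directly. The following sketch is our own illustration: it assumes units with $c = 1$, a boost along the $x$-axis with velocity $\beta$, and arbitrarily chosen sample vectors. It checks that both a boost and an ordinary rotation leave $a \cdot b$ unchanged.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric diag(1, -1, -1, -1)

def minkowski(a, b):
    """Inner product a.b = eta_{mu nu} a^mu b^nu of two four-vectors (t, x, y, z)."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def boost_x(v, beta):
    """Lorentz boost along the x-axis with velocity beta (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    t, x, y, z = v
    return (gamma * (t - beta * x), gamma * (x - beta * t), y, z)

def rotate_z(v, phi):
    """Spatial rotation about the z-axis by an angle phi in radians."""
    t, x, y, z = v
    c, s = math.cos(phi), math.sin(phi)
    return (t, c * x - s * y, s * x + c * y, z)

a = (2.0, 1.0, 0.0, 3.0)
b = (1.0, -1.0, 2.0, 0.5)

before = minkowski(a, b)
after_boost = minkowski(boost_x(a, 0.6), boost_x(b, 0.6))
after_rotation = minkowski(rotate_z(a, 0.8), rotate_z(b, 0.8))

print(math.isclose(before, after_boost))     # True
print(math.isclose(before, after_rotation))  # True
```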


Space (time) symmetries. In physics and chemistry one
type of order refers to the situation where the atoms form a
lattice in space, and so it is of interest to look at the symmetries
of a lattice. If we look at a triangular lattice, or triangular
tiling of the plane like in Figure III.2.24(a), we see that
we not just have the rotations by multiples of 60°, but also
translations along the sides of the triangles. Those translations
can be generated$^4$ by the two basic translations $t_1$
and $t_2$ of the discrete translation group $G = T^2 = T \times T$.
Note that each translation group is the same as the group
of the integers: $T \simeq \mathbb{Z}$.


Abelian versus non-abelian groups. If we now combine
the rotations and the translations, we learn something
interesting about the structure of the group, namely
that the group composition rule is not necessarily commutative,
which just means that in general we have
$g_1 \cdot g_2 \neq g_2 \cdot g_1$. The group is then called non-commutative
or non-abelian. This is clearly different from the multiplication
and addition of ordinary numbers, which are commutative.
Ordinary division is of course not, as in general
$a/b \neq b/a$, but if you define division as multiplication by
the inverse it is, as $a \cdot \frac{1}{b} = \frac{1}{b} \cdot a$.


$^4$ Generated means that all translations can be obtained by repeated
application of the two basic translations.


Figure A.22: The symmetries of two-dimensional Euclidean
space. Picture showing that translations and rotations (of a triangular
object) do not commute. It is a fact we are all familiar
with: if you first make a step sideways and then turn, you end
up in a different place than if you first turn and then make a step
sideways. Formally stated: if we first translate along the bottom
side of the triangle and then rotate over 30°, we act with $r \cdot t_1$,
and we end up with the rotated triangle in the lower right-hand
corner; if we first rotate and then translate, we act with $t_1 \cdot r$,
and we end up with the rotated triangle in the upper right-hand
corner. The operations are clearly not the same.

The rotations (in a plane) by themselves do commute: if I
first rotate by an angle $\varphi_1$ and then by $\varphi_2$, the net result is a
rotation by $\varphi_1 + \varphi_2$, and that is the same as first rotating
by $\varphi_2$ and then by $\varphi_1$. The same is true for the translations
by themselves, as $a + b = b + a$.

It is no longer true if we combine rotations and translations
as we did in Figure A.22. If we choose $r$ with $\varphi = 60°$,
and for the translation take $n$ applications of $t_1$, i.e. a
translation over $n$ times the side of a triangle,
then both operations leave the lattice of Figure III.2.24(a)
invariant. They belong to the invariance group of the lattice
but correspond to different elements. The terminology is
that we call the total invariance group of a lattice a space
group, whereas the rotational part of it forms a point group,
as it leaves a point of the space fixed. Note that if we
think of the plane as a continuous space, usually denoted
by $\mathbb{R}^2$, then the space group would be the group made
up of arbitrary rotations and arbitrary translations; this is
a continuous group denoted by $E_2$, the Euclidean group
in two dimensions. This group of course also has higher
$n$-dimensional analogues called $E_n$.
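The 'step sideways and turn' argument is easy to replay numerically. In this sketch, which is an illustration under our own conventions, a point in the plane is stored as a complex number, a rotation about the origin multiplies it by a phase, and a translation adds a constant; the starting point and step size are arbitrary choices.

```python
import cmath
import math

def rotate(z, phi_deg):
    """Rotate the point z about the origin by phi_deg degrees."""
    return z * cmath.exp(1j * math.radians(phi_deg))

def translate(z, a):
    """Translate the point z by the displacement a."""
    return z + a

z0 = 1 + 0j  # arbitrary starting point

# r . t : first translate by one unit, then rotate over 60 degrees
r_after_t = rotate(translate(z0, 1), 60)
# t . r : first rotate over 60 degrees, then translate by one unit
t_after_r = translate(rotate(z0, 60), 1)

print(abs(r_after_t - t_after_r) > 0.5)  # True: the end points differ

# Rotations among themselves, and translations among themselves, do commute.
print(cmath.isclose(rotate(rotate(z0, 20), 60), rotate(rotate(z0, 60), 20)))         # True
print(cmath.isclose(translate(translate(z0, 1), 2j), translate(translate(z0, 2j), 1)))  # True
```

The two end points differ by one unit here, which is the algebraic counterpart of the two different triangles in Figure A.22.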


Groups of matrices. There are many groups that can be
represented by matrices, because invertible square matrices close
under the matrix product. Generically such groups are
non-abelian. But one can also make restrictions to subsets
of matrices that are themselves closed under matrix multiplication.
Of special interest for us are the orthogonal and
unitary matrices, O(n) and U(n). They act as non-abelian
transformation groups of rotations on the real and complex
spaces $\mathbb{R}^n$ and $\mathbb{C}^n$. The matrices satisfy $O O^T = 1$ and
$U U^\dagger = 1$ respectively.


The group SU(2) of 2 × 2 unitary matrices.
Let us add an important remark on the relation between
hermitian and unitary matrices. Let me recall the Euler formula
for the exponential of the imaginary number $i\varphi$ (A.28):

\[ e^{i\varphi} = \cos\varphi + i\sin\varphi . \]


The sine and cosine appearing show that it is indeed a periodic
function, and therefore we choose an angular variable
$\varphi$. You might wonder whether similar formulas can
be written down for matrices. The answer is a full-fledged
yes, and that brings us to the relation between Lie algebras
and Lie groups. Let me give you the extremely useful
generalization of the Euler formula to the hermitian $(2 \times 2)$
matrices. Consider an su(2) matrix,$^5$

\[ A = \hat{n}_x X/2 + \hat{n}_y Y/2 + \hat{n}_z Z/2 , \]

where $\hat{n}$ is some arbitrary vector of unit length. If $\theta$ is some
angular variable, then in general the following relation holds:

\[ e^{i\theta A} = \mathbf{1}\cos(\theta/2) + 2iA\,\sin(\theta/2) . \qquad \text{(A.34)} \]

The factor 2 in the second term compensates the factor 1/2 in the
definition of $A$, since $(2A)^2 = \mathbf{1}$.


This elegant equation has many applications in all corners
of theoretical physics, and we will use it repeatedly later
on. It does for example represent a rotation of a two-component
spinor over an angle $\theta$ around the $\hat{n}$ axis, with
the peculiar but characteristic property that a rotation by
$\theta = 2\pi$ of any spinor maps it to minus itself. As mentioned
before, that is a property that distinguishes spinors from
‘ordinary’ vectors. One thing that is immediately clear from
the above formula is that the expression corresponds to a
unitary matrix. This holds in general: if we write a matrix
$U$ as the exponential $U = e^{iA}$ of $i$ times a hermitian matrix $A$,
then we can write:

\[ U^\dagger = (e^{iA})^\dagger = e^{-iA^\dagger} = e^{-iA} = U^{-1} , \]

which shows that $U$ is a unitary matrix. This property, that
exponentials of $i$ times hermitian matrices are unitary operators, is
widely used in quantum theory, in particular in the theory
of (unitary) representations of symmetry groups that act on
the Hilbert space of a system.
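The generalized Euler formula (A.34) can be tested numerically without any special library. The sketch below is an illustration under stated assumptions: the matrix exponential is approximated by a truncated power series (the helper `expm` and the number of terms are our own choices), the unit vector $\hat{n}$ and the angle are arbitrary, and the closed form is written in terms of the Pauli matrices as $\mathbf{1}\cos(\theta/2) + i\,(\hat{n}\cdot\sigma)\sin(\theta/2)$ with $A = \hat{n}\cdot\sigma/2$.

```python
import math

I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def scale(c, a):
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(a):
    """Hermitian conjugate: transpose and complex conjugate."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def expm(a, terms=40):
    """Matrix exponential from the truncated power series sum_k a^k / k!."""
    result, term = I2, I2
    for k in range(1, terms):
        term = scale(1 / k, matmul(term, a))
        result = add(result, term)
    return result

def n_dot_sigma(n):
    return add(add(scale(n[0], X), scale(n[1], Y)), scale(n[2], Z))

def spinor_rotation(theta, n):
    """Closed form: 1 cos(theta/2) + i (n . sigma) sin(theta/2)."""
    return add(scale(math.cos(theta / 2), I2),
               scale(1j * math.sin(theta / 2), n_dot_sigma(n)))

def close(a, b, tol=1e-10):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

n = (1 / 3, 2 / 3, 2 / 3)        # arbitrary unit vector
theta = 1.2                       # arbitrary angle in radians
A = scale(0.5, n_dot_sigma(n))    # the su(2) matrix n . sigma / 2
U = expm(scale(1j * theta, A))

print(close(U, spinor_rotation(theta, n)))                  # True: series matches closed form
print(close(matmul(U, dagger(U)), I2))                      # True: U is unitary
print(close(expm(scale(2j * math.pi, A)), scale(-1, I2)))   # True: rotation by 2 pi gives -1
```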


To summarize this part: we saw a close relationship
between the ‘algebra of observables’ of a quantum system,
which is a Lie algebra (i.e. a closed commutator algebra),
and the corresponding Lie group, which is obtained by putting
elements of the algebra in the exponent. In that sense we say that
the observables (the Lie algebra) generate small or infinitesimal
transformations, while the exponentials (elements of the Lie group)
correspond to finite transformations.


$^5$ We have mentioned before that half the Pauli matrices
$\{X/2, Y/2, Z/2\}$ form a basis for the angular momentum or spin
algebra, as they satisfy $[S_x, S_y] = iS_z$ etc.

Invariants. There are two more properties of matrices we
want to discuss: these are what are called invariants under
basis transformations. First observe that we may rotate
the basis of a vector space. Then the components of the
vector change and are obtained by acting with the corresponding
matrix $U$. In the main text we showed that basis
transformations have to preserve the scalar product of two
arbitrary vectors and therefore satisfy the unitarity condition
$U^\dagger U = 1$, so that $U^\dagger = U^{-1}$. So if we have
a matrix operator $A$ acting on vectors in a given frame and
we ask what the matrix looks like in the rotated or ‘primed’
frame, we can see that from the following algebraic manipulations.
First we define:


\[ |\psi'\rangle = U|\psi\rangle \quad \text{and} \quad |\phi\rangle = A|\psi\rangle , \]


which allows us to write:


\[ |\phi'\rangle = U|\phi\rangle = UA|\psi\rangle = UAU^{-1}U|\psi\rangle = UAU^{-1}|\psi'\rangle = A'|\psi'\rangle , \]


implying that $A' = UAU^{-1}$. Given these expressions for
how state vectors and observables transform under unitary
basis transformations, you might ask whether there are
any quantities related to these observables that are preserved
under such transformations. The answer is affirmative:
the invariant quantity corresponds to the set of eigenvalues,
particularly the sum and the product of all eigenvalues,
denoted as the trace and the determinant.
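A quick numerical experiment makes the invariance tangible. The sketch below is our own illustration: the hermitian matrix $A$, the unitary matrix $U$ (a rotation times an overall phase) and the extra matrices $B$ and $C$ are arbitrary choices. It checks that the trace and the determinant of $A' = UAU^{-1}$ agree with those of $A$, and that the trace of a product is unchanged under a cyclic permutation of the factors.

```python
import cmath
import math

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(a):
    """Hermitian conjugate; for a unitary matrix this equals the inverse."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def trace(a):
    return a[0][0] + a[1][1]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# An arbitrary unitary matrix: a rotation multiplied by an overall phase.
alpha, beta = 0.7, 0.3
c, s = math.cos(alpha), math.sin(alpha)
ph = cmath.exp(1j * beta)
U = ((ph * c, -ph * s), (ph * s, ph * c))

# An arbitrary hermitian 'observable' and two further arbitrary matrices.
A = ((2, 1 - 1j), (1 + 1j, -1))
B = ((0, 1j), (2, 3))
C = ((1, 4), (-1j, 0.5))

A_prime = matmul(matmul(U, A), dagger(U))   # A' = U A U^{-1}

print(abs(trace(A_prime) - trace(A)) < 1e-12)   # True
print(abs(det(A_prime) - det(A)) < 1e-12)       # True

# Cyclic property: tr(ABC) = tr(CAB)
abc = trace(matmul(matmul(A, B), C))
cab = trace(matmul(matmul(C, A), B))
print(abs(abc - cab) < 1e-12)                   # True
```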


The trace of a matrix $A$, denoted by $\mathrm{tr}\,A$, is defined as
the sum of the diagonal elements, so $\mathrm{tr}\,A = \sum_i A_{ii}$. The
trace is indeed invariant under basis transformations, as
one easily sees:

\[ \mathrm{tr}\,A' = \mathrm{tr}\,(UAU^{-1}) = \mathrm{tr}\,(U^{-1}UA) = \mathrm{tr}\,A . \]

The trace satisfies the cyclic property, meaning that the
trace of a product is invariant under cyclic permutations,
i.e. putting the matrices in the trace on a circle holding
hands and moving them around:

\[ \mathrm{tr}\,(ABC) = \sum_{ijk} A_{ij}B_{jk}C_{ki} = \sum_{ijk} C_{ki}A_{ij}B_{jk} = \mathrm{tr}\,(CAB) \quad \text{etc.} \]

The point is that all indices are pairwise summed over. We
will see that the trace, because it is frame independent,
plays an important role in certain aspects of quantum theory.

Appendix B


Chronologies, ideas and people 


In this appendix we list the scientific achievements in the quantum domain over more than a century, as well as the names
and the dates of the Nobel prizes that were awarded for these. It demonstrates that quantum physics is everywhere and
has come to dominate progress in physics to a large extent.
The tables cover the following topics:

B.1 Foundational concepts and their protagonists 

B.2 Turning points in quantum condensed matter theory 

B.3 Turning points in elementary particle theory 

B.4 Nobel prizes awarded for discovery of fundamental particles 


B.5 Nobel prizes for astrophysics and cosmology 


B.6 Nobel prizes awarded (from 1944 onwards) for the invention and development of new techniques and devices 




Figure B.1: The early quantum giants at the fifth Solvay conference, held in Brussels in 1927. On that occasion quantum mechanics, 
including the ‘Copenhagen interpretation’, was presented as a complete and final theory of atomic phenomena. 


The person | Year | The concept | The mathematical statement
Planck | 1897 | Planck’s constant | $\hbar = h/2\pi$
 | 1900 | Black-body radiation | $\rho(\nu,T) = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{e^{h\nu/kT} - 1}$
Einstein | 1905 | Photoelectric effect, the photon | $E = h\nu$
Bohr | 1913 | Atomic model | $E_n \simeq -m e^4 / 2\hbar^2 n^2$
De Broglie | 1923 | Matter waves | $\lambda = h/mv$
Einstein, Podolsky, Rosen | 1935 | EPR paradox, entanglement | $\lvert\psi(1,2)\rangle = (\lvert 00\rangle + \lvert 11\rangle)/\sqrt{2}$
Bose, Einstein | 1924 | Quantum statistics, Bose condensate | $n_i = g_i/(e^{\beta(\varepsilon_i - \mu)} - 1)$
Pauli | 1924 | Exclusion principle | $\psi(x_1, x_2) = -\psi(x_2, x_1)$
Heisenberg | 1925 | Matrix mechanics | $d\hat{A}/dt = (i/\hbar)[\hat{H}, \hat{A}]$
 | 1927 | Uncertainty relations | $\Delta x\,\Delta p \ge \hbar/2$
Von Neumann | 1925 | Density matrix, quantum entropy | $\rho = \sum_a p_a \lvert\psi_a\rangle\langle\psi_a\rvert$, $S = -\mathrm{tr}\,(\rho \ln \rho)$
Schrödinger | 1926 | Wave mechanics | $i\hbar\, d\psi/dt = \hat{H}\psi$
Born | 1926 | Probability interpretation | $P = \lvert\psi\rvert^2 = \psi^*\psi$
Fermi | 1927 | Quantum statistics for fermions | $n_i = g_i/(e^{\beta(\varepsilon_i - \mu)} + 1)$
Dirac | 1927 | Dirac equation | $(i\gamma^\mu\partial_\mu + e\gamma^\mu A_\mu + m)\,\psi(x,t) = 0$
Bell | 1964 | Bell inequality | $\lvert P_c(a,b) - P_c(a,c)\rvert \le 1 + P_c(b,c)$
Bennett, Brassard, Deutsch, Shor | >1980 | Quantum information/computation | key distribution, teleportation, prime factoring algorithm

Table B.1: Foundational quantum concepts and their protagonists.

The person | The concept | Year
Kamerlingh Onnes | Superconductivity (experiment) | 1911
Bloch | Conduction band | 1920
Uhlenbeck, Goudsmit | Spin | 1925
Van Vleck | Theory of magnetism | 1935
Kapitza, Allen, Misener | Superfluidity | 1938
Pauling | The nature of chemical binding | 1939
Rabi | Nuclear magnetic resonance (NMR) | 1946
Purcell, Bloch | NMR (implementations) | 1952
Bardeen, Houser, Brattain, Shockley | Semiconductors, transistor | 1950
Gabor | Holography | 1950
Landau | Fermi liquids, quasiparticles, phase transitions | 1952
Bardeen, Cooper, Schrieffer | BCS theory of superconductivity | 1957
Townes, Basov, Prokhorov | Laser | 1958
Anderson | Localization | 1958
Aharonov, Bohm | Aharonov-Bohm effect | 1959
Haldane, Kosterlitz, Thouless | Topological phase transitions | 1973
De Gennes | Liquid crystals (mostly classical physics) | 1974
Laughlin | Theory of fractional quantum Hall effect | 1983
Berry | Berry phase | 1984
Cornell, Wieman | Bose-Einstein condensation (experiment) | 1995
Kitaev, Wen | Topological order | 1987
Lauterbur, Mansfield | Magnetic resonance imaging (MRI) | 2003
Geim, Novoselov | Graphene | 2004
Aspect, Clauser, Zeilinger | Entangled photons (experiments) | >1980

Table B.2: Turning points in quantum condensed matter (theory) and quantum optics.

The person | The concept | Year
Feynman, Schwinger, Dyson, Tomonaga | Quantum electrodynamics (QED) | 1946
Yang, Mills | Non-Abelian gauge theory | 1954
Gell-Mann, Zweig | SU(3) quarks | 1963
Nambu, Jona-Lasinio | Chiral symmetry breaking | 1965
Glashow, Weinberg, Salam | Weak and electromagnetic theory | 1968
Higgs, Brout, Englert | Higgs mechanism | 1969
't Hooft, Veltman | Renormalization of non-Abelian gauge theories | 1970
Wilson | Theory of critical phenomena, confinement | 1972
Gell-Mann, Leutwyler, Fritzsch | Quantum Chromodynamics (QCD) | 1971
Gross, Politzer, Wilczek | Asymptotic freedom | 1973
Witten, Schwarz, Green | String theory | 1983
Polyakov, Belavin, Zamolodchikov | Conformal Field Theory (CFT) | 1983
Witten | Topological Field Theory | 1983
Maldacena | Anti de Sitter/CFT correspondence | 1995

Table B.3: Turning points in elementary particle theory.

Laureate | Discovery | Year
Röntgen | X-rays | 1901
Becquerel, Curie, Curie | Radioactive decay ($\alpha$ and $\beta$ radiation) | 1903
Thomson | Electron | 1906
Rutherford | Nucleus | 1908
Planck | Quanta of radiation | 1918
Einstein | Photon | 1921
Compton | Compton effect | 1927
Chadwick | Neutron | 1935
Anderson | Positron | 1936
Powell | Pion | 1950
Chamberlain, Segrè | Antiproton | 1959
Richter, Ting | J/Psi meson | 1976
Rubbia, Van der Meer | W and Z bosons | 1984
Lederman, Schwartz, Steinberger | Muon neutrino | 1988
Friedman, Kendall, Taylor | Quarks | 1990
Perl | Tau lepton | 1995
Reines | Neutrino | 1995

Table B.4: Nobel prizes awarded for discovery of elementary particles.

Laureate | Subject | Year
Bethe | Energy production in stars | 1967
Ryle, Hewish | Pulsars | 1974
Penzias, Wilson | Microwave background radiation | 1965
Chandrasekhar, Fowler | Theories of star evolution | 1983
Hulse, Taylor | Precision tests of gravity | 1993
Davis, Koshiba | Cosmic neutrinos | 2002
and Giacconi | X-ray sources | 2002
Mather, Smoot | Anisotropy in background radiation | 2006
Perlmutter, Schmidt, Riess | Accelerated expansion | 2011
Thorne, Weiss, Barish | Gravitational wave detection | 2017

Table B.5: Nobel prizes for astrophysics and cosmology.

Laureate | Technique or device | Year
Rabi | Nuclear magnetic resonance | 1944
Bridgman | Apparatus to produce extremely high pressures | 1946
Blackett | The Wilson cloud chamber method | 1948
Powell | Photographic method of studying nuclear processes | 1950
Bloch and Purcell | Nuclear magnetic precision measurements | 1952
Zernike | Phase contrast microscope | 1953
Glaser | Bubble chamber | 1960
Shockley, Bardeen and Brattain | Transistor | 1956
Alvarez | Hydrogen bubble chamber and data analysis techniques | 1968
Gabor | Holographic method | 1971
Ryle and Hewish | Radio astrophysics | 1974
Bloembergen and Schawlow | Laser spectroscopy | 1981
Siegbahn | High-resolution electron spectroscopy | 1981
Ruska | Electron microscope | 1986
Binnig and Rohrer | Scanning tunneling microscope | 1986
Ramsey | Separated oscillatory fields method and its use in atomic clocks | 1989
Dehmelt and Paul | Ion trap technique | 1989
Charpak | Multiwire proportional chamber | 1990
Brockhouse | Neutron spectroscopy | 1994
Shull | Neutron diffraction | 1994
Alferov and Kroemer | Semiconductor heterostructures, high-speed- and opto-electronics | 2000
Kilby | His part in the invention of the integrated circuit | 2000
Hall and Hänsch | Laser-based precision spectroscopy, optical frequency comb technique | 2004
Kao | Light transmission in fibers for optical communication | 2009
Boyle, Smith | Invention of imaging semiconductor circuit, the CCD sensor | 2009
Fert and Grünberg | Giant magnetoresistance | 2007
Haroche, Wineland | Measuring and manipulation of individual quantum systems | 2012
Akasaki, Amano, Nakamura | Bright blue light-emitting diodes | 2014
Weiss, Barish, Thorne | Gravitational wave detector LIGO | 2017


Table B.6: Nobel prizes awarded (from 1944 onwards) for the invention and development of new techniques and devices.

